Linear Regression & False Discovery Rate (FDR)¶

In [52]:
#import all necessary libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import itertools
from scipy.stats import kstest
from sklearn.preprocessing import StandardScaler
In [53]:
df = pd.read_csv("fundamentals.csv")

Data Exploration and Visualization¶

Explore the “fundamentals.csv”. Include any other plots you find interesting.¶

In [54]:
df.shape
Out[54]:
(1781, 79)
In [55]:
pd.set_option('display.max_columns', 80)
pd.set_option('display.max_rows', 80)
In [56]:
df.head(3)
Out[56]:
Unnamed: 0 Ticker Symbol Period Ending Accounts Payable Accounts Receivable Add'l income/expense items After Tax ROE Capital Expenditures Capital Surplus Cash Ratio Cash and Cash Equivalents Changes in Inventories Common Stocks Cost of Revenue Current Ratio Deferred Asset Charges Deferred Liability Charges Depreciation Earnings Before Interest and Tax Earnings Before Tax Effect of Exchange Rate Equity Earnings/Loss Unconsolidated Subsidiary Fixed Assets Goodwill Gross Margin Gross Profit Income Tax Intangible Assets Interest Expense Inventory Investments Liabilities Long-Term Debt Long-Term Investments Minority Interest Misc. Stocks Net Borrowings Net Cash Flow Net Cash Flow-Operating Net Cash Flows-Financing Net Cash Flows-Investing Net Income Net Income Adjustments Net Income Applicable to Common Shareholders Net Income-Cont. Operations Net Receivables Non-Recurring Items Operating Income Operating Margin Other Assets Other Current Assets Other Current Liabilities Other Equity Other Financing Activities Other Investing Activities Other Liabilities Other Operating Activities Other Operating Items Pre-Tax Margin Pre-Tax ROE Profit Margin Quick Ratio Research and Development Retained Earnings Sale and Purchase of Stock Sales, General and Admin. Short-Term Debt / Current Portion of Long-Term Debt Short-Term Investments Total Assets Total Current Assets Total Current Liabilities Total Equity Total Liabilities Total Liabilities & Equity Total Revenue Treasury Stock For Year Earnings Per Share Estimated Shares Outstanding
0 0 AAL 2012-12-31 3.068000e+09 -222000000.0 -1.961000e+09 23.0 -1.888000e+09 4.695000e+09 53.0 1.330000e+09 0.0 127000000.0 1.049900e+10 78.0 0.0 223000000.0 1.001000e+09 -1.813000e+09 -2.445000e+09 0.0 0.0 1.340200e+10 0.000000e+00 58.0 1.435600e+10 -569000000.0 8.690000e+08 632000000.0 5.800000e+08 3.060000e+08 4.730000e+08 7.116000e+09 0.0 0.0 0.0 -1.020000e+09 197000000.0 1.285000e+09 4.830000e+08 -1.571000e+09 -1.876000e+09 2.050000e+09 -1.876000e+09 -4.084000e+09 1.124000e+09 386000000.0 1.480000e+08 1.0 2.167000e+09 6.260000e+08 4.524000e+09 -2.980000e+09 1.509000e+09 11000000.0 1.514700e+10 -141000000.0 8.450000e+08 10.0 31.0 8.0 72.0 0.0 -9.462000e+09 0.000000e+00 1.297700e+10 1.419000e+09 3.412000e+09 2.351000e+10 7.072000e+09 9.011000e+09 -7.987000e+09 2.489100e+10 1.690400e+10 2.485500e+10 -367000000.0 2012.0 -5.60 3.350000e+08
1 1 AAL 2013-12-31 4.975000e+09 -93000000.0 -2.723000e+09 67.0 -3.114000e+09 1.059200e+10 75.0 2.175000e+09 0.0 5000000.0 1.101900e+10 104.0 0.0 935000000.0 1.020000e+09 -1.324000e+09 -2.180000e+09 0.0 0.0 1.925900e+10 4.086000e+09 59.0 1.572400e+10 -346000000.0 2.311000e+09 856000000.0 1.012000e+09 -1.181000e+09 -2.350000e+08 1.535300e+10 0.0 0.0 0.0 2.208000e+09 660000000.0 6.750000e+08 3.799000e+09 -3.814000e+09 -1.834000e+09 1.873000e+09 -1.834000e+09 -4.489000e+09 1.560000e+09 559000000.0 1.399000e+09 5.0 2.299000e+09 1.465000e+09 7.385000e+09 -2.032000e+09 1.711000e+09 481000000.0 1.491500e+10 -56000000.0 8.530000e+08 8.0 80.0 7.0 96.0 0.0 -1.129600e+10 0.000000e+00 1.291300e+10 1.446000e+09 8.111000e+09 4.227800e+10 1.432300e+10 1.380600e+10 -2.731000e+09 4.500900e+10 4.227800e+10 2.674300e+10 0.0 2013.0 -11.25 1.630222e+08
2 2 AAL 2014-12-31 4.668000e+09 -160000000.0 -1.500000e+08 143.0 -5.311000e+09 1.513500e+10 60.0 1.768000e+09 0.0 7000000.0 1.562000e+10 88.0 0.0 829000000.0 1.342000e+09 4.099000e+09 3.212000e+09 0.0 0.0 2.308400e+10 4.091000e+09 63.0 2.703000e+10 330000000.0 2.240000e+09 887000000.0 1.004000e+09 1.799000e+09 -1.026000e+09 1.604300e+10 0.0 0.0 0.0 1.700000e+08 -146000000.0 3.080000e+09 -3.150000e+08 -2.911000e+09 2.882000e+09 5.420000e+08 2.882000e+09 2.882000e+09 1.771000e+09 800000000.0 4.249000e+09 10.0 2.060000e+09 8.980000e+08 7.059000e+09 -4.559000e+09 8.170000e+08 601000000.0 1.092800e+10 -500000000.0 1.295000e+09 8.0 159.0 7.0 80.0 0.0 -8.562000e+09 -1.052000e+09 2.068600e+10 1.677000e+09 6.309000e+09 4.322500e+10 1.175000e+10 1.340400e+10 2.021000e+09 4.120400e+10 4.322500e+10 4.265000e+10 0.0 2014.0 4.02 7.169154e+08
In [57]:
# dropping the unnecessary column
df.drop(['Unnamed: 0'], axis=1,inplace=True)
In [58]:
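# replace spaces and special characters in the column names
# (regex=False treats every pattern as a literal string, so '.' is not a regex wildcard)
df.columns = df.columns.str.replace(' ', '_', regex=False)
df.columns = df.columns.str.replace(',', '', regex=False)
df.columns = df.columns.str.replace('.', '', regex=False)
df.columns = df.columns.str.replace("'", "", regex=False)
df.columns = df.columns.str.replace("-", "_", regex=False)
df.columns = df.columns.str.replace("&", "and", regex=False)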
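The chain of replacements above leaves "/" in three column names (for example Addl_income/expense_items), which would be read as an operator if such a column were ever used in a formula string. A minimal sketch of a single-pass alternative follows; `sanitize_column_name` is a hypothetical helper, not part of the notebook, and it produces slightly different names for the columns that contain "/".

```python
import re

def sanitize_column_name(name: str) -> str:
    """Turn a raw column header into an identifier-like name (hypothetical helper)."""
    name = name.replace("&", "and")              # spell out the ampersand
    name = re.sub(r"[.,']", "", name)            # drop punctuation that carries no meaning
    name = re.sub(r"[^0-9A-Za-z_]+", "_", name)  # collapse anything else into one underscore
    return name

examples = [
    "Add'l income/expense items",
    "Sales, General and Admin.",
    "Total Liabilities & Equity",
    "Net Income-Cont. Operations",
]
cleaned = [sanitize_column_name(raw) for raw in examples]
print(cleaned)
```

If adopted, it would be applied as `df.columns = [sanitize_column_name(c) for c in df.columns]`, in place of the chained `str.replace` calls.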
In [59]:
# checking for null values
df.isnull().sum()
Out[59]:
Ticker_Symbol                                            0
Period_Ending                                            0
Accounts_Payable                                         0
Accounts_Receivable                                      0
Addl_income/expense_items                                0
After_Tax_ROE                                            0
Capital_Expenditures                                     0
Capital_Surplus                                          0
Cash_Ratio                                             299
Cash_and_Cash_Equivalents                                0
Changes_in_Inventories                                   0
Common_Stocks                                            0
Cost_of_Revenue                                          0
Current_Ratio                                          299
Deferred_Asset_Charges                                   0
Deferred_Liability_Charges                               0
Depreciation                                             0
Earnings_Before_Interest_and_Tax                         0
Earnings_Before_Tax                                      0
Effect_of_Exchange_Rate                                  0
Equity_Earnings/Loss_Unconsolidated_Subsidiary           0
Fixed_Assets                                             0
Goodwill                                                 0
Gross_Margin                                             0
Gross_Profit                                             0
Income_Tax                                               0
Intangible_Assets                                        0
Interest_Expense                                         0
Inventory                                                0
Investments                                              0
Liabilities                                              0
Long_Term_Debt                                           0
Long_Term_Investments                                    0
Minority_Interest                                        0
Misc_Stocks                                              0
Net_Borrowings                                           0
Net_Cash_Flow                                            0
Net_Cash_Flow_Operating                                  0
Net_Cash_Flows_Financing                                 0
Net_Cash_Flows_Investing                                 0
Net_Income                                               0
Net_Income_Adjustments                                   0
Net_Income_Applicable_to_Common_Shareholders             0
Net_Income_Cont_Operations                               0
Net_Receivables                                          0
Non_Recurring_Items                                      0
Operating_Income                                         0
Operating_Margin                                         0
Other_Assets                                             0
Other_Current_Assets                                     0
Other_Current_Liabilities                                0
Other_Equity                                             0
Other_Financing_Activities                               0
Other_Investing_Activities                               0
Other_Liabilities                                        0
Other_Operating_Activities                               0
Other_Operating_Items                                    0
Pre_Tax_Margin                                           0
Pre_Tax_ROE                                              0
Profit_Margin                                            0
Quick_Ratio                                            299
Research_and_Development                                 0
Retained_Earnings                                        0
Sale_and_Purchase_of_Stock                               0
Sales_General_and_Admin                                  0
Short_Term_Debt_/_Current_Portion_of_Long_Term_Debt      0
Short_Term_Investments                                   0
Total_Assets                                             0
Total_Current_Assets                                     0
Total_Current_Liabilities                                0
Total_Equity                                             0
Total_Liabilities                                        0
Total_Liabilities_and_Equity                             0
Total_Revenue                                            0
Treasury_Stock                                           0
For_Year                                               173
Earnings_Per_Share                                     219
Estimated_Shares_Outstanding                           219
dtype: int64
In [60]:
# drop every row that contains at least one null value
df_final1 = df.dropna()
In [61]:
df_final1.shape
Out[61]:
(1299, 78)

We are now left with 1,299 rows that contain no null values.

Next, we check how many zero values each column contains.

In [62]:
zero_df = pd.DataFrame((df_final1 == 0).sum())
sns.histplot(x=zero_df[0])
Out[62]:
<Axes: xlabel='0', ylabel='Count'>

Observation_1¶

Although the majority of the columns (>50) contain no zeros, many columns contain a fair number of zero values. We may need to treat these later if we apply a log transformation.

In [63]:
# convert 'For_Year' to string so that it is treated as a categorical axis
for_year = df_final1['For_Year'].astype(str)

# creating a boxplot for our target variable split by year
plt.figure(figsize=(5,3))
sns.boxplot(x=for_year, y=np.log(df_final1["Estimated_Shares_Outstanding"]), data=df_final1)
plt.title('Target Variable by Year')
plt.show()
/Users/kumarkishalaya/anaconda3/lib/python3.11/site-packages/pandas/core/arraylike.py:396: RuntimeWarning: invalid value encountered in log
  result = getattr(ufunc, method)(*inputs, **kwargs)

Observation_2¶

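The distribution is much the same across the different years. After the log transformation the target also looks closer to a normal distribution, with more controlled variability. The RuntimeWarning above indicates that some target values are negative; their log is undefined (NaN), so those rows do not appear in the plot.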

We also found an incorrect year '1215' which might be a data entry error.
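
A value such as 1215 can also be caught programmatically rather than by reading the plot. The sketch below flags years outside a plausible range; `find_implausible_years` and the bounds 1990 and 2030 are assumptions chosen for illustration, and the toy series merely stands in for df_final1['For_Year'].

```python
import pandas as pd

def find_implausible_years(years: pd.Series, lo: int = 1990, hi: int = 2030) -> pd.Series:
    """Return the entries of `years` that fall outside the closed range [lo, hi]."""
    numeric = pd.to_numeric(years, errors="coerce")  # non-numeric entries become NaN
    return numeric[(numeric < lo) | (numeric > hi)]

# Toy data standing in for df_final1["For_Year"]
toy_years = pd.Series([2012.0, 2013.0, 1215.0, 2015.0])
suspect = find_implausible_years(toy_years)
print(suspect)
```

The returned index identifies the offending rows, which can then be corrected or dropped.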

Scatter plots with Variables¶

We start with a scatter plot of the first variable against the target, to get an idea of which transformation is useful and whether a linear relationship is visible.

In [64]:
sns.scatterplot(x=np.log(df_final1['Accounts_Payable']), y=np.log(df_final1["Estimated_Shares_Outstanding"]), alpha=0.5)
plt.xlabel("log(Accounts Payable)")
plt.ylabel("log(Estimated Shares Outstanding)")
plt.show()
print("correlation value is :", np.log(abs(df_final1["Accounts_Payable"])).corr(np.log(df_final1["Estimated_Shares_Outstanding"])))
/Users/kumarkishalaya/anaconda3/lib/python3.11/site-packages/pandas/core/arraylike.py:396: RuntimeWarning: invalid value encountered in log
  result = getattr(ufunc, method)(*inputs, **kwargs)
correlation value is : 0.6399802608853331
/Users/kumarkishalaya/anaconda3/lib/python3.11/site-packages/pandas/core/arraylike.py:396: RuntimeWarning: invalid value encountered in log
  result = getattr(ufunc, method)(*inputs, **kwargs)

Observation_3¶

  • We can see a fairly high correlation (about 0.64) when both variables are log transformed.
  • But we need to be careful, since many of our columns contain zero values, which are problematic for a log transformation.
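
One common way around zeros and negative values is a signed log transform, sign(x) * log(1 + |x|), which is finite everywhere. This is not what the notebook does (it uses np.log, with abs() on some inputs); the sketch below is only an illustration of the alternative, and `signed_log1p` is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

def signed_log1p(values: pd.Series) -> pd.Series:
    """sign(x) * log(1 + |x|): defined for zeros and negatives, keeps the sign."""
    return np.sign(values) * np.log1p(np.abs(values))

# Toy values on the scale of the balance-sheet columns
toy = pd.Series([-1.0e9, 0.0, 5.0e8])
transformed = signed_log1p(toy)
print(transformed)
```

Unlike np.log(abs(x)), this keeps losses and gains on opposite sides of zero and maps zero to zero instead of -inf.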

We will check how many columns contain only a few zeros.

In [65]:
zero_df[zero_df[0]< 60].count()
Out[65]:
0    50
dtype: int64
In [66]:
filtered_columns = zero_df[zero_df[0]< 60].index.tolist()

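There are 50 columns with fewer than 60 zero values (under 5% of the 1,299 rows), so we will proceed with these columns only. Two of them, Ticker_Symbol and Period_Ending, are identifiers, which leaves 48 numeric columns including the target.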
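
The same filter can be written with a fraction instead of a hard-coded count, so that it keeps working if the number of rows changes. This is a sketch under assumptions: `columns_with_few_zeros` is a hypothetical helper, and a 0.05 cutoff on 1,299 rows would admit up to 64 zeros, so it may keep a slightly different set of columns than the `< 60` rule used in the cell above.

```python
import pandas as pd

def columns_with_few_zeros(frame: pd.DataFrame, max_zero_fraction: float = 0.05) -> list:
    """Names of the columns whose share of exact zeros is below the cutoff."""
    zero_fraction = (frame == 0).mean()  # per-column fraction of zero entries
    return zero_fraction[zero_fraction < max_zero_fraction].index.tolist()

# Toy frame: 'a' has no zeros, 'b' is 50% zeros, 'c' is 25% zeros
toy = pd.DataFrame({"a": [1, 2, 3, 4], "b": [0, 0, 1, 2], "c": [5, 0, 7, 8]})
kept = columns_with_few_zeros(toy, max_zero_fraction=0.3)
print(kept)
```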

In [67]:
df_filtered = df_final1[[col for col in df_final1.columns if col in filtered_columns]]
In [68]:
df_filtered.head()
Out[68]:
Ticker_Symbol Period_Ending Accounts_Payable After_Tax_ROE Capital_Expenditures Cash_Ratio Cash_and_Cash_Equivalents Cost_of_Revenue Current_Ratio Depreciation Earnings_Before_Interest_and_Tax Earnings_Before_Tax Fixed_Assets Gross_Margin Gross_Profit Income_Tax Liabilities Net_Cash_Flow Net_Cash_Flow_Operating Net_Cash_Flows_Financing Net_Cash_Flows_Investing Net_Income Net_Income_Adjustments Net_Income_Applicable_to_Common_Shareholders Net_Income_Cont_Operations Net_Receivables Operating_Income Operating_Margin Other_Current_Assets Other_Equity Other_Investing_Activities Other_Liabilities Other_Operating_Activities Pre_Tax_Margin Pre_Tax_ROE Profit_Margin Quick_Ratio Retained_Earnings Sale_and_Purchase_of_Stock Sales_General_and_Admin Total_Assets Total_Current_Assets Total_Current_Liabilities Total_Equity Total_Liabilities Total_Liabilities_and_Equity Total_Revenue For_Year Earnings_Per_Share Estimated_Shares_Outstanding
0 AAL 2012-12-31 3.068000e+09 23.0 -1.888000e+09 53.0 1.330000e+09 1.049900e+10 78.0 1.001000e+09 -1.813000e+09 -2.445000e+09 1.340200e+10 58.0 1.435600e+10 -5.690000e+08 4.730000e+08 197000000.0 1.285000e+09 4.830000e+08 -1.571000e+09 -1.876000e+09 2.050000e+09 -1.876000e+09 -4.084000e+09 1.124000e+09 1.480000e+08 1.0 6.260000e+08 -2.980000e+09 11000000.0 1.514700e+10 -141000000.0 10.0 31.0 8.0 72.0 -9.462000e+09 0.000000e+00 1.297700e+10 2.351000e+10 7.072000e+09 9.011000e+09 -7.987000e+09 2.489100e+10 1.690400e+10 2.485500e+10 2012.0 -5.60 3.350000e+08
1 AAL 2013-12-31 4.975000e+09 67.0 -3.114000e+09 75.0 2.175000e+09 1.101900e+10 104.0 1.020000e+09 -1.324000e+09 -2.180000e+09 1.925900e+10 59.0 1.572400e+10 -3.460000e+08 -2.350000e+08 660000000.0 6.750000e+08 3.799000e+09 -3.814000e+09 -1.834000e+09 1.873000e+09 -1.834000e+09 -4.489000e+09 1.560000e+09 1.399000e+09 5.0 1.465000e+09 -2.032000e+09 481000000.0 1.491500e+10 -56000000.0 8.0 80.0 7.0 96.0 -1.129600e+10 0.000000e+00 1.291300e+10 4.227800e+10 1.432300e+10 1.380600e+10 -2.731000e+09 4.500900e+10 4.227800e+10 2.674300e+10 2013.0 -11.25 1.630222e+08
2 AAL 2014-12-31 4.668000e+09 143.0 -5.311000e+09 60.0 1.768000e+09 1.562000e+10 88.0 1.342000e+09 4.099000e+09 3.212000e+09 2.308400e+10 63.0 2.703000e+10 3.300000e+08 -1.026000e+09 -146000000.0 3.080000e+09 -3.150000e+08 -2.911000e+09 2.882000e+09 5.420000e+08 2.882000e+09 2.882000e+09 1.771000e+09 4.249000e+09 10.0 8.980000e+08 -4.559000e+09 601000000.0 1.092800e+10 -500000000.0 8.0 159.0 7.0 80.0 -8.562000e+09 -1.052000e+09 2.068600e+10 4.322500e+10 1.175000e+10 1.340400e+10 2.021000e+09 4.120400e+10 4.322500e+10 4.265000e+10 2014.0 4.02 7.169154e+08
3 AAL 2015-12-31 5.102000e+09 135.0 -6.151000e+09 51.0 1.085000e+09 1.109600e+10 73.0 1.487000e+09 5.496000e+09 4.616000e+09 2.751000e+10 73.0 2.989400e+10 -2.994000e+09 -6.330000e+08 -604000000.0 6.249000e+09 -1.259000e+09 -5.594000e+09 7.610000e+09 -2.662000e+09 7.610000e+09 7.610000e+09 1.425000e+09 6.204000e+09 15.0 7.480000e+08 -4.732000e+09 114000000.0 1.017800e+10 95000000.0 11.0 82.0 19.0 67.0 -1.230000e+09 -3.846000e+09 2.127500e+10 4.841500e+10 9.985000e+09 1.360500e+10 5.635000e+09 4.278000e+10 4.841500e+10 4.099000e+10 2015.0 11.39 6.681299e+08
4 AAP 2012-12-29 2.409453e+09 32.0 -2.711820e+08 23.0 5.981110e+08 3.106967e+09 124.0 1.895440e+08 6.579150e+08 6.240740e+08 1.292547e+09 50.0 3.098036e+09 2.364040e+08 4.263230e+08 540210000.0 6.852810e+08 1.279070e+08 -2.729780e+08 3.876700e+08 2.331100e+07 3.876700e+08 3.876700e+08 2.298660e+08 6.573150e+08 11.0 4.761400e+07 2.667000e+06 -1796000.0 2.390210e+08 8213000.0 10.0 52.0 6.0 34.0 7.149000e+08 -1.860000e+07 2.440721e+09 4.613814e+09 3.184200e+09 2.559638e+09 1.210694e+09 3.403120e+09 4.613814e+09 6.205003e+09 2012.0 5.29 7.328355e+07

We will compute the correlation of every variable with the target to find the best predictors¶

In [69]:
corr_df = df_filtered.drop(["Ticker_Symbol", 'Period_Ending'], axis=1).corr()
corr_df["Estimated_Shares_Outstanding"]
Out[69]:
Accounts_Payable                                0.518993
After_Tax_ROE                                  -0.030411
Capital_Expenditures                           -0.487570
Cash_Ratio                                      0.087224
Cash_and_Cash_Equivalents                       0.519088
Cost_of_Revenue                                 0.376881
Current_Ratio                                  -0.001455
Depreciation                                    0.602825
Earnings_Before_Interest_and_Tax                0.640557
Earnings_Before_Tax                             0.628960
Fixed_Assets                                    0.443940
Gross_Margin                                    0.081330
Gross_Profit                                    0.690821
Income_Tax                                      0.537103
Liabilities                                     0.149608
Net_Cash_Flow                                  -0.029694
Net_Cash_Flow_Operating                         0.731004
Net_Cash_Flows_Financing                       -0.483550
Net_Cash_Flows_Investing                       -0.620207
Net_Income                                      0.660966
Net_Income_Adjustments                          0.252053
Net_Income_Applicable_to_Common_Shareholders    0.660485
Net_Income_Cont_Operations                      0.628996
Net_Receivables                                 0.523625
Operating_Income                                0.655909
Operating_Margin                                0.019781
Other_Current_Assets                            0.537010
Other_Equity                                   -0.208536
Other_Investing_Activities                     -0.197971
Other_Liabilities                               0.570793
Other_Operating_Activities                     -0.193131
Pre_Tax_Margin                                  0.043430
Pre_Tax_ROE                                    -0.029950
Profit_Margin                                   0.058860
Quick_Ratio                                     0.051061
Retained_Earnings                               0.438246
Sale_and_Purchase_of_Stock                     -0.512954
Sales_General_and_Admin                         0.518767
Total_Assets                                    0.756219
Total_Current_Assets                            0.738505
Total_Current_Liabilities                       0.660835
Total_Equity                                    0.719912
Total_Liabilities                               0.697836
Total_Liabilities_and_Equity                    0.756288
Total_Revenue                                   0.498266
For_Year                                        0.005782
Earnings_Per_Share                             -0.051850
Estimated_Shares_Outstanding                    1.000000
Name: Estimated_Shares_Outstanding, dtype: float64
In [70]:
### We will only pick the top 25 variables (those with a correlation higher than 0.43)

corr_df[corr_df["Estimated_Shares_Outstanding"]> 0.43].shape
Out[70]:
(26, 48)

Observation_4¶

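We have picked the 25 variables whose correlation with the target variable is higher than 0.43. The filter returns 26 rows because it also includes the target itself. Because it keeps only positive correlations, variables with a strong negative correlation, such as Net_Cash_Flows_Investing (-0.62), are left out.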
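
If strongly negative correlations should count as well, the selection can be done on the absolute value. The sketch below is an assumption-laden illustration, not the notebook's method: `select_by_abs_correlation` is a hypothetical helper, the toy frame is made up, and the 0.43 default simply mirrors the threshold used above.

```python
import pandas as pd

def select_by_abs_correlation(frame: pd.DataFrame, target: str, threshold: float = 0.43) -> list:
    """Columns whose |Pearson correlation| with `target` exceeds `threshold`."""
    corr = frame.corr()[target].drop(target)  # the target's correlation with itself is dropped
    return corr[corr.abs() > threshold].index.tolist()

# Toy frame: 'pos' rises with the target, 'neg' falls with it, 'noise' is only weakly related
toy = pd.DataFrame({
    "pos": [1, 2, 3, 4, 5, 6],
    "neg": [-1, -2, -3, -4, -5, -7],
    "noise": [1, -1, 1, -1, 1, -1],
    "target": [2, 4, 6, 8, 10, 12],
})
selected = select_by_abs_correlation(toy, "target")
print(selected)
```

Applied to the numeric part of df_filtered, this would also pick up columns such as Net_Cash_Flows_Investing, so the resulting list would be longer than 25.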

Now we will try to visualise the scatterplots for each of those 25 variables.

In [71]:
imp_features =  corr_df[corr_df["Estimated_Shares_Outstanding"]> 0.43].index.tolist()
In [72]:
#now we create a loop to draw scatter plots of the target variable with each of these 25 variables
# We will also log transform to be able to better visualize the relation.

for column in imp_features:
    if column != 'Estimated_Shares_Outstanding' and np.issubdtype(df_filtered[column].dtype, np.number): # this is to ignore target variable and only include float data type columns
        x_transformed = np.log(abs(df_filtered[column]))
        y_transformed = np.log(df_filtered["Estimated_Shares_Outstanding"])

        sns.scatterplot(x=x_transformed, y = y_transformed, alpha = 0.5)
        plt.xlabel(column)
        plt.ylabel("Estimated Shares Outstanding")
        plt.title(f"The log-log plot between {column} and Target Variable")

        plt.show()
/Users/kumarkishalaya/anaconda3/lib/python3.11/site-packages/pandas/core/arraylike.py:396: RuntimeWarning: invalid value encountered in log
  result = getattr(ufunc, method)(*inputs, **kwargs)
/Users/kumarkishalaya/anaconda3/lib/python3.11/site-packages/pandas/core/arraylike.py:396: RuntimeWarning: divide by zero encountered in log
  result = getattr(ufunc, method)(*inputs, **kwargs)

Observation_5¶

We can see that many of these 25 variables have a visible linear relationship with the target variable.

Our final DataFrame (df_final) will contain these 25 variables, which have the highest correlation, along with the target variable.

We will also fit a model on the 50 columns that had the fewest zero values (df_filtered).

In [73]:
df_final = df_filtered[[col for col in df_filtered.columns if col in imp_features]]
In [74]:
# Dataframe with 25 variables
df_final.head()
Out[74]:
Accounts_Payable Cash_and_Cash_Equivalents Depreciation Earnings_Before_Interest_and_Tax Earnings_Before_Tax Fixed_Assets Gross_Profit Income_Tax Net_Cash_Flow_Operating Net_Income Net_Income_Applicable_to_Common_Shareholders Net_Income_Cont_Operations Net_Receivables Operating_Income Other_Current_Assets Other_Liabilities Retained_Earnings Sales_General_and_Admin Total_Assets Total_Current_Assets Total_Current_Liabilities Total_Equity Total_Liabilities Total_Liabilities_and_Equity Total_Revenue Estimated_Shares_Outstanding
0 3.068000e+09 1.330000e+09 1.001000e+09 -1.813000e+09 -2.445000e+09 1.340200e+10 1.435600e+10 -5.690000e+08 1.285000e+09 -1.876000e+09 -1.876000e+09 -4.084000e+09 1.124000e+09 1.480000e+08 6.260000e+08 1.514700e+10 -9.462000e+09 1.297700e+10 2.351000e+10 7.072000e+09 9.011000e+09 -7.987000e+09 2.489100e+10 1.690400e+10 2.485500e+10 3.350000e+08
1 4.975000e+09 2.175000e+09 1.020000e+09 -1.324000e+09 -2.180000e+09 1.925900e+10 1.572400e+10 -3.460000e+08 6.750000e+08 -1.834000e+09 -1.834000e+09 -4.489000e+09 1.560000e+09 1.399000e+09 1.465000e+09 1.491500e+10 -1.129600e+10 1.291300e+10 4.227800e+10 1.432300e+10 1.380600e+10 -2.731000e+09 4.500900e+10 4.227800e+10 2.674300e+10 1.630222e+08
2 4.668000e+09 1.768000e+09 1.342000e+09 4.099000e+09 3.212000e+09 2.308400e+10 2.703000e+10 3.300000e+08 3.080000e+09 2.882000e+09 2.882000e+09 2.882000e+09 1.771000e+09 4.249000e+09 8.980000e+08 1.092800e+10 -8.562000e+09 2.068600e+10 4.322500e+10 1.175000e+10 1.340400e+10 2.021000e+09 4.120400e+10 4.322500e+10 4.265000e+10 7.169154e+08
3 5.102000e+09 1.085000e+09 1.487000e+09 5.496000e+09 4.616000e+09 2.751000e+10 2.989400e+10 -2.994000e+09 6.249000e+09 7.610000e+09 7.610000e+09 7.610000e+09 1.425000e+09 6.204000e+09 7.480000e+08 1.017800e+10 -1.230000e+09 2.127500e+10 4.841500e+10 9.985000e+09 1.360500e+10 5.635000e+09 4.278000e+10 4.841500e+10 4.099000e+10 6.681299e+08
4 2.409453e+09 5.981110e+08 1.895440e+08 6.579150e+08 6.240740e+08 1.292547e+09 3.098036e+09 2.364040e+08 6.852810e+08 3.876700e+08 3.876700e+08 3.876700e+08 2.298660e+08 6.573150e+08 4.761400e+07 2.390210e+08 7.149000e+08 2.440721e+09 4.613814e+09 3.184200e+09 2.559638e+09 1.210694e+09 3.403120e+09 4.613814e+09 6.205003e+09 7.328355e+07

Applying Linear Regression¶

Create linear regression to predict Estimated Shares Outstanding. Explain your model.¶

In [75]:
import statsmodels.formula.api as smf
In [76]:
variables = '+'.join(df_final.columns.drop('Estimated_Shares_Outstanding'))
formula = f'Estimated_Shares_Outstanding ~ {variables}'
model25 = smf.ols(formula=formula, data=df_final).fit()
In [77]:
model25.summary()
Out[77]:
OLS Regression Results
Dep. Variable: Estimated_Shares_Outstanding R-squared: 0.784
Model: OLS Adj. R-squared: 0.780
Method: Least Squares F-statistic: 192.3
Date: Thu, 18 Jan 2024 Prob (F-statistic): 0.00
Time: 14:51:58 Log-Likelihood: -27764.
No. Observations: 1299 AIC: 5.558e+04
Df Residuals: 1274 BIC: 5.571e+04
Df Model: 24
Covariance Type: nonrobust
coef std err t P>|t| [0.025 0.975]
Intercept 9.016e+07 1.74e+07 5.167 0.000 5.59e+07 1.24e+08
Accounts_Payable -0.0576 0.007 -8.119 0.000 -0.072 -0.044
Cash_and_Cash_Equivalents -0.0236 0.008 -3.081 0.002 -0.039 -0.009
Depreciation 0.0018 0.017 0.107 0.915 -0.031 0.034
Earnings_Before_Interest_and_Tax -0.1279 0.072 -1.765 0.078 -0.270 0.014
Earnings_Before_Tax -0.1288 0.075 -1.711 0.087 -0.277 0.019
Fixed_Assets -0.0169 0.002 -8.423 0.000 -0.021 -0.013
Gross_Profit 0.0347 0.009 3.827 0.000 0.017 0.053
Income_Tax 0.2208 0.039 5.683 0.000 0.145 0.297
Net_Cash_Flow_Operating -0.0049 0.011 -0.457 0.648 -0.026 0.016
Net_Income 0.2925 0.213 1.374 0.170 -0.125 0.710
Net_Income_Applicable_to_Common_Shareholders -0.1964 0.211 -0.932 0.351 -0.610 0.217
Net_Income_Cont_Operations 0.1232 0.037 3.316 0.001 0.050 0.196
Net_Receivables -0.0446 0.005 -8.130 0.000 -0.055 -0.034
Operating_Income 0.0316 0.021 1.476 0.140 -0.010 0.074
Other_Current_Assets -0.0296 0.013 -2.197 0.028 -0.056 -0.003
Other_Liabilities 0.0013 0.006 0.203 0.839 -0.011 0.014
Retained_Earnings -0.0037 0.001 -2.851 0.004 -0.006 -0.001
Sales_General_and_Admin -0.0044 0.009 -0.481 0.630 -0.022 0.013
Total_Assets -0.0029 0.053 -0.055 0.956 -0.107 0.102
Total_Current_Assets 0.0358 0.003 11.081 0.000 0.029 0.042
Total_Current_Liabilities 0.0021 0.006 0.332 0.740 -0.010 0.014
Total_Equity 0.0116 0.018 0.651 0.515 -0.023 0.047
Total_Liabilities 0.0063 0.018 0.356 0.722 -0.028 0.041
Total_Liabilities_and_Equity 0.0180 0.035 0.507 0.612 -0.052 0.087
Total_Revenue -0.0018 0.001 -1.675 0.094 -0.004 0.000
Omnibus: 1068.044 Durbin-Watson: 1.343
Prob(Omnibus): 0.000 Jarque-Bera (JB): 115787.467
Skew: 3.148 Prob(JB): 0.00
Kurtosis: 48.822 Cond. No. 1.17e+16


Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 1.17e+16. This might indicate that there are
strong multicollinearity or other numerical problems.
Explanation of the model (25 variables model)¶
  • model25 has an R-squared of 0.784, which means the model explains about 78% of the variation in Estimated_Shares_Outstanding.
  • This model has comparatively few variables (all numeric) and a high F-statistic (192.3). The F-statistic tests whether the predictors are jointly significant; a large value with Prob (F-statistic) close to 0 means that the predictors, taken together, explain far more variation than an intercept-only model.

  • Several variables, such as 'Net_Income_Cont_Operations' and 'Income_Tax', have a positive coefficient. For example, Income_Tax has a coefficient of 0.2208: holding the other predictors constant, an increase of 1 unit in Income_Tax is associated with an increase of 0.2208 units in Estimated_Shares_Outstanding.

  • Similarly, several variables, such as 'Cash_and_Cash_Equivalents', 'Earnings_Before_Tax' and 'Fixed_Assets', have a negative coefficient. For example, Fixed_Assets has a coefficient of -0.0169: holding the other predictors constant, an increase of 1 unit in Fixed_Assets is associated with a decrease of 0.0169 units in Estimated_Shares_Outstanding. Not every coefficient is statistically significant, though: Earnings_Before_Tax, for instance, has a p-value of 0.087.

We will now fit the linear regression on the full dataset of 50 variables¶
In [78]:
## Data Cleaning

# Work on an explicit copy so the assignments below do not trigger SettingWithCopyWarning
df_filtered = df_filtered.copy()
df_filtered["For_Year"] = df_filtered["For_Year"].replace(1215, 2015)
df_filtered["For_Year"] = df_filtered["For_Year"].astype(str)
df_filtered = df_filtered.drop(columns=['Period_Ending', 'Ticker_Symbol'])
In [79]:
#Running the Regression on 50 variables

variables = '+'.join(df_filtered.columns.drop('Estimated_Shares_Outstanding'))
formula = f'Estimated_Shares_Outstanding ~ {variables}'

model50 =smf.ols(formula=formula,data=df_filtered).fit()
model50.summary()
Out[79]:
OLS Regression Results
Dep. Variable: Estimated_Shares_Outstanding R-squared: 0.824
Model: OLS Adj. R-squared: 0.817
Method: Least Squares F-statistic: 121.9
Date: Thu, 18 Jan 2024 Prob (F-statistic): 0.00
Time: 14:51:59 Log-Likelihood: -27630.
No. Observations: 1299 AIC: 5.536e+04
Df Residuals: 1250 BIC: 5.561e+04
Df Model: 48
Covariance Type: nonrobust
coef std err t P>|t| [0.025 0.975]
Intercept 3.161e+08 5.82e+07 5.435 0.000 2.02e+08 4.3e+08
For_Year[T.2013.0] -6.215e+07 3.95e+07 -1.575 0.115 -1.4e+08 1.53e+07
For_Year[T.2014.0] -6.176e+07 3.96e+07 -1.560 0.119 -1.39e+08 1.59e+07
For_Year[T.2015.0] -7.038e+07 4.01e+07 -1.754 0.080 -1.49e+08 8.33e+06
For_Year[T.2016.0] -8.142e+07 5.88e+07 -1.384 0.167 -1.97e+08 3.4e+07
Accounts_Payable -0.0492 0.007 -6.902 0.000 -0.063 -0.035
After_Tax_ROE -4.578e+05 5.41e+05 -0.846 0.398 -1.52e+06 6.04e+05
Capital_Expenditures 0.0441 0.016 2.831 0.005 0.014 0.075
Cash_Ratio 1.086e+06 3.97e+05 2.739 0.006 3.08e+05 1.86e+06
Cash_and_Cash_Equivalents -0.0078 0.009 -0.873 0.383 -0.025 0.010
Cost_of_Revenue -250.9281 168.673 -1.488 0.137 -581.841 79.985
Current_Ratio -5.199e+05 2.98e+05 -1.744 0.081 -1.1e+06 6.49e+04
Depreciation 0.0612 0.026 2.332 0.020 0.010 0.113
Earnings_Before_Interest_and_Tax -0.2454 0.070 -3.484 0.001 -0.384 -0.107
Earnings_Before_Tax -0.0086 0.078 -0.111 0.912 -0.161 0.144
Fixed_Assets -0.0067 0.003 -2.628 0.009 -0.012 -0.002
Gross_Margin 4.748e+05 7.62e+05 0.623 0.533 -1.02e+06 1.97e+06
Gross_Profit -250.8991 168.673 -1.487 0.137 -581.813 80.014
Income_Tax 0.1814 0.040 4.494 0.000 0.102 0.261
Liabilities -0.0017 0.020 -0.084 0.933 -0.042 0.038
Net_Cash_Flow -0.1372 0.056 -2.454 0.014 -0.247 -0.028
Net_Cash_Flow_Operating 0.1005 0.051 1.975 0.049 0.001 0.200
Net_Cash_Flows_Financing 0.0836 0.056 1.498 0.134 -0.026 0.193
Net_Cash_Flows_Investing 0.1627 0.056 2.914 0.004 0.053 0.272
Net_Income 0.6352 0.202 3.149 0.002 0.239 1.031
Net_Income_Adjustments 0.0507 0.024 2.081 0.038 0.003 0.098
Net_Income_Applicable_to_Common_Shareholders -0.4745 0.199 -2.380 0.017 -0.866 -0.083
Net_Income_Cont_Operations 0.1339 0.040 3.342 0.001 0.055 0.213
Net_Receivables -0.0391 0.006 -6.465 0.000 -0.051 -0.027
Operating_Income 0.0526 0.025 2.139 0.033 0.004 0.101
Operating_Margin -4.2e+06 2.08e+06 -2.023 0.043 -8.27e+06 -1.27e+05
Other_Current_Assets -0.0266 0.013 -2.074 0.038 -0.052 -0.001
Other_Equity 0.0006 0.008 0.069 0.945 -0.016 0.017
Other_Investing_Activities -0.0272 0.007 -3.779 0.000 -0.041 -0.013
Other_Liabilities -0.0053 0.006 -0.830 0.407 -0.018 0.007
Other_Operating_Activities 0.0064 0.032 0.200 0.841 -0.057 0.069
Pre_Tax_Margin 4.799e+06 2.82e+06 1.700 0.089 -7.4e+05 1.03e+07
Pre_Tax_ROE 3.088e+05 3.67e+05 0.841 0.401 -4.12e+05 1.03e+06
Profit_Margin -3.191e+06 3.18e+06 -1.005 0.315 -9.42e+06 3.04e+06
Quick_Ratio -2.401e+05 4.58e+05 -0.525 0.600 -1.14e+06 6.58e+05
Retained_Earnings -0.0063 0.002 -3.912 0.000 -0.010 -0.003
Sale_and_Purchase_of_Stock 0.0468 0.009 5.289 0.000 0.029 0.064
Sales_General_and_Admin -0.0097 0.009 -1.108 0.268 -0.027 0.007
Total_Assets -0.0019 0.050 -0.039 0.969 -0.100 0.096
Total_Current_Assets 0.0309 0.003 9.750 0.000 0.025 0.037
Total_Current_Liabilities -0.0008 0.006 -0.120 0.904 -0.013 0.012
Total_Equity 320.7287 181.665 1.765 0.078 -35.674 677.131
Total_Liabilities 320.7252 181.666 1.765 0.078 -35.678 677.128
Total_Liabilities_and_Equity -320.7022 181.663 -1.765 0.078 -677.100 35.696
Total_Revenue 250.9260 168.673 1.488 0.137 -79.987 581.839
Earnings_Per_Share -3.243e+07 2.98e+06 -10.869 0.000 -3.83e+07 -2.66e+07
Omnibus: 1183.733 Durbin-Watson: 1.443
Prob(Omnibus): 0.000 Jarque-Bera (JB): 166887.298
Skew: 3.663 Prob(JB): 0.00
Kurtosis: 58.043 Cond. No. 1.17e+16


Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 1.17e+16. This might indicate that there are
strong multicollinearity or other numerical problems.
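Explanation of the model (50 variables model)¶
  • model50 has an R-squared of 0.824, which means the model explains about 82% of the variation in Estimated_Shares_Outstanding.

  • The higher R-squared can also be a sign of overfitting, since R-squared never decreases when variables are added. The gain over model25 is modest: R-squared rises from 0.784 to 0.824 and adjusted R-squared from 0.780 to 0.817, at the cost of roughly twice as many predictors.

  • This model has a comparatively high number of variables and a lower F-statistic (121.9 versus 192.3). The F-statistic measures joint significance per model degree of freedom, so the drop indicates that the added variables contribute less explanatory power on average. With so many correlated predictors, the individual coefficient estimates are also less stable.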
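The adjusted R-squared values reported in the two summaries can be reproduced by hand, which makes the penalty for extra predictors explicit. The sketch below uses the rounded R-squared, observation count and residual degrees of freedom printed above; the helper name `adjusted_r2` is only illustrative.

```python
def adjusted_r2(r2, n_obs, df_resid):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / df_resid."""
    return 1 - (1 - r2) * (n_obs - 1) / df_resid

# Inputs are the (rounded) values printed in the OLS summaries
adj25 = adjusted_r2(0.784, n_obs=1299, df_resid=1274)  # model25
adj50 = adjusted_r2(0.824, n_obs=1299, df_resid=1250)  # model50

print(round(adj25, 3), round(adj50, 3))
```

Both results agree with the Adj. R-squared column of the summaries to three decimals. The gap between R-squared and adjusted R-squared is larger for model50 because it uses up more residual degrees of freedom.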

In [98]:
 
Out[98]:
'Estimated_Shares_Outstanding ~ Accounts_Payable + Cash_and_Cash_Equivalents + Depreciation + Earnings_Before_Interest_and_Tax + Earnings_Before_Tax + Fixed_Assets + Gross_Profit + Income_Tax + Net_Cash_Flow_Operating + Net_Income + Net_Income_Applicable_to_Common_Shareholders + Net_Income_Cont_Operations + Net_Receivables + Operating_Income + Other_Current_Assets + Other_Liabilities + Retained_Earnings + Sales_General_and_Admin + Total_Assets + Total_Current_Assets + Total_Current_Liabilities + Total_Equity + Total_Liabilities + Total_Liabilities_and_Equity + Total_Revenue + Accounts_Payable**2 + Cash_and_Cash_Equivalents**2 + Depreciation**2 + Earnings_Before_Interest_and_Tax**2 + Earnings_Before_Tax**2 + Fixed_Assets**2 + Gross_Profit**2 + Income_Tax**2 + Net_Cash_Flow_Operating**2 + Net_Income**2 + Net_Income_Applicable_to_Common_Shareholders**2 + Net_Income_Cont_Operations**2 + Net_Receivables**2 + Operating_Income**2 + Other_Current_Assets**2 + Other_Liabilities**2 + Retained_Earnings**2 + Sales_General_and_Admin**2 + Total_Assets**2 + Total_Current_Assets**2 + Total_Current_Liabilities**2 + Total_Equity**2 + Total_Liabilities**2 + Total_Liabilities_and_Equity**2 + Total_Revenue**2 + Accounts_Payable:Cash_and_Cash_Equivalents + Accounts_Payable:Depreciation + Accounts_Payable:Earnings_Before_Interest_and_Tax + Accounts_Payable:Earnings_Before_Tax + Accounts_Payable:Fixed_Assets + Accounts_Payable:Gross_Profit + Accounts_Payable:Income_Tax + Accounts_Payable:Net_Cash_Flow_Operating + Accounts_Payable:Net_Income + Accounts_Payable:Net_Income_Applicable_to_Common_Shareholders + Accounts_Payable:Net_Income_Cont_Operations + Accounts_Payable:Net_Receivables + Accounts_Payable:Operating_Income + Accounts_Payable:Other_Current_Assets + Accounts_Payable:Other_Liabilities + Accounts_Payable:Retained_Earnings + Accounts_Payable:Sales_General_and_Admin + Accounts_Payable:Total_Assets + Accounts_Payable:Total_Current_Assets + Accounts_Payable:Total_Current_Liabilities + 
Accounts_Payable:Total_Equity + Accounts_Payable:Total_Liabilities + Accounts_Payable:Total_Liabilities_and_Equity + Accounts_Payable:Total_Revenue + Cash_and_Cash_Equivalents:Depreciation + Cash_and_Cash_Equivalents:Earnings_Before_Interest_and_Tax + Cash_and_Cash_Equivalents:Earnings_Before_Tax + Cash_and_Cash_Equivalents:Fixed_Assets + Cash_and_Cash_Equivalents:Gross_Profit + Cash_and_Cash_Equivalents:Income_Tax + Cash_and_Cash_Equivalents:Net_Cash_Flow_Operating + Cash_and_Cash_Equivalents:Net_Income + Cash_and_Cash_Equivalents:Net_Income_Applicable_to_Common_Shareholders + Cash_and_Cash_Equivalents:Net_Income_Cont_Operations + Cash_and_Cash_Equivalents:Net_Receivables + Cash_and_Cash_Equivalents:Operating_Income + Cash_and_Cash_Equivalents:Other_Current_Assets + Cash_and_Cash_Equivalents:Other_Liabilities + Cash_and_Cash_Equivalents:Retained_Earnings + Cash_and_Cash_Equivalents:Sales_General_and_Admin + Cash_and_Cash_Equivalents:Total_Assets + Cash_and_Cash_Equivalents:Total_Current_Assets + Cash_and_Cash_Equivalents:Total_Current_Liabilities + Cash_and_Cash_Equivalents:Total_Equity + Cash_and_Cash_Equivalents:Total_Liabilities + Cash_and_Cash_Equivalents:Total_Liabilities_and_Equity + Cash_and_Cash_Equivalents:Total_Revenue + Depreciation:Earnings_Before_Interest_and_Tax + Depreciation:Earnings_Before_Tax + Depreciation:Fixed_Assets + Depreciation:Gross_Profit + Depreciation:Income_Tax + Depreciation:Net_Cash_Flow_Operating + Depreciation:Net_Income + Depreciation:Net_Income_Applicable_to_Common_Shareholders + Depreciation:Net_Income_Cont_Operations + Depreciation:Net_Receivables + Depreciation:Operating_Income + Depreciation:Other_Current_Assets + Depreciation:Other_Liabilities + Depreciation:Retained_Earnings + Depreciation:Sales_General_and_Admin + Depreciation:Total_Assets + Depreciation:Total_Current_Assets + Depreciation:Total_Current_Liabilities + Depreciation:Total_Equity + Depreciation:Total_Liabilities + Depreciation:Total_Liabilities_and_Equity + 
Depreciation:Total_Revenue + Earnings_Before_Interest_and_Tax:Earnings_Before_Tax + Earnings_Before_Interest_and_Tax:Fixed_Assets + Earnings_Before_Interest_and_Tax:Gross_Profit + Earnings_Before_Interest_and_Tax:Income_Tax + Earnings_Before_Interest_and_Tax:Net_Cash_Flow_Operating + Earnings_Before_Interest_and_Tax:Net_Income + Earnings_Before_Interest_and_Tax:Net_Income_Applicable_to_Common_Shareholders + Earnings_Before_Interest_and_Tax:Net_Income_Cont_Operations + Earnings_Before_Interest_and_Tax:Net_Receivables + Earnings_Before_Interest_and_Tax:Operating_Income + Earnings_Before_Interest_and_Tax:Other_Current_Assets + Earnings_Before_Interest_and_Tax:Other_Liabilities + Earnings_Before_Interest_and_Tax:Retained_Earnings + Earnings_Before_Interest_and_Tax:Sales_General_and_Admin + Earnings_Before_Interest_and_Tax:Total_Assets + Earnings_Before_Interest_and_Tax:Total_Current_Assets + Earnings_Before_Interest_and_Tax:Total_Current_Liabilities + Earnings_Before_Interest_and_Tax:Total_Equity + Earnings_Before_Interest_and_Tax:Total_Liabilities + Earnings_Before_Interest_and_Tax:Total_Liabilities_and_Equity + Earnings_Before_Interest_and_Tax:Total_Revenue + Earnings_Before_Tax:Fixed_Assets + Earnings_Before_Tax:Gross_Profit + Earnings_Before_Tax:Income_Tax + Earnings_Before_Tax:Net_Cash_Flow_Operating + Earnings_Before_Tax:Net_Income + Earnings_Before_Tax:Net_Income_Applicable_to_Common_Shareholders + Earnings_Before_Tax:Net_Income_Cont_Operations + Earnings_Before_Tax:Net_Receivables + Earnings_Before_Tax:Operating_Income + Earnings_Before_Tax:Other_Current_Assets + Earnings_Before_Tax:Other_Liabilities + Earnings_Before_Tax:Retained_Earnings + Earnings_Before_Tax:Sales_General_and_Admin + Earnings_Before_Tax:Total_Assets + Earnings_Before_Tax:Total_Current_Assets + Earnings_Before_Tax:Total_Current_Liabilities + Earnings_Before_Tax:Total_Equity + Earnings_Before_Tax:Total_Liabilities + Earnings_Before_Tax:Total_Liabilities_and_Equity + 
Earnings_Before_Tax:Total_Revenue + Fixed_Assets:Gross_Profit + Fixed_Assets:Income_Tax + Fixed_Assets:Net_Cash_Flow_Operating + Fixed_Assets:Net_Income + Fixed_Assets:Net_Income_Applicable_to_Common_Shareholders + Fixed_Assets:Net_Income_Cont_Operations + Fixed_Assets:Net_Receivables + Fixed_Assets:Operating_Income + Fixed_Assets:Other_Current_Assets + Fixed_Assets:Other_Liabilities + Fixed_Assets:Retained_Earnings + Fixed_Assets:Sales_General_and_Admin + Fixed_Assets:Total_Assets + Fixed_Assets:Total_Current_Assets + Fixed_Assets:Total_Current_Liabilities + Fixed_Assets:Total_Equity + Fixed_Assets:Total_Liabilities + Fixed_Assets:Total_Liabilities_and_Equity + Fixed_Assets:Total_Revenue + Gross_Profit:Income_Tax + Gross_Profit:Net_Cash_Flow_Operating + Gross_Profit:Net_Income + Gross_Profit:Net_Income_Applicable_to_Common_Shareholders + Gross_Profit:Net_Income_Cont_Operations + Gross_Profit:Net_Receivables + Gross_Profit:Operating_Income + Gross_Profit:Other_Current_Assets + Gross_Profit:Other_Liabilities + Gross_Profit:Retained_Earnings + Gross_Profit:Sales_General_and_Admin + Gross_Profit:Total_Assets + Gross_Profit:Total_Current_Assets + Gross_Profit:Total_Current_Liabilities + Gross_Profit:Total_Equity + Gross_Profit:Total_Liabilities + Gross_Profit:Total_Liabilities_and_Equity + Gross_Profit:Total_Revenue + Income_Tax:Net_Cash_Flow_Operating + Income_Tax:Net_Income + Income_Tax:Net_Income_Applicable_to_Common_Shareholders + Income_Tax:Net_Income_Cont_Operations + Income_Tax:Net_Receivables + Income_Tax:Operating_Income + Income_Tax:Other_Current_Assets + Income_Tax:Other_Liabilities + Income_Tax:Retained_Earnings + Income_Tax:Sales_General_and_Admin + Income_Tax:Total_Assets + Income_Tax:Total_Current_Assets + Income_Tax:Total_Current_Liabilities + Income_Tax:Total_Equity + Income_Tax:Total_Liabilities + Income_Tax:Total_Liabilities_and_Equity + Income_Tax:Total_Revenue + Net_Cash_Flow_Operating:Net_Income + 
Net_Cash_Flow_Operating:Net_Income_Applicable_to_Common_Shareholders + Net_Cash_Flow_Operating:Net_Income_Cont_Operations + Net_Cash_Flow_Operating:Net_Receivables + Net_Cash_Flow_Operating:Operating_Income + Net_Cash_Flow_Operating:Other_Current_Assets + Net_Cash_Flow_Operating:Other_Liabilities + Net_Cash_Flow_Operating:Retained_Earnings + Net_Cash_Flow_Operating:Sales_General_and_Admin + Net_Cash_Flow_Operating:Total_Assets + Net_Cash_Flow_Operating:Total_Current_Assets + Net_Cash_Flow_Operating:Total_Current_Liabilities + Net_Cash_Flow_Operating:Total_Equity + Net_Cash_Flow_Operating:Total_Liabilities + Net_Cash_Flow_Operating:Total_Liabilities_and_Equity + Net_Cash_Flow_Operating:Total_Revenue + Net_Income:Net_Income_Applicable_to_Common_Shareholders + Net_Income:Net_Income_Cont_Operations + Net_Income:Net_Receivables + Net_Income:Operating_Income + Net_Income:Other_Current_Assets + Net_Income:Other_Liabilities + Net_Income:Retained_Earnings + Net_Income:Sales_General_and_Admin + Net_Income:Total_Assets + Net_Income:Total_Current_Assets + Net_Income:Total_Current_Liabilities + Net_Income:Total_Equity + Net_Income:Total_Liabilities + Net_Income:Total_Liabilities_and_Equity + Net_Income:Total_Revenue + Net_Income_Applicable_to_Common_Shareholders:Net_Income_Cont_Operations + Net_Income_Applicable_to_Common_Shareholders:Net_Receivables + Net_Income_Applicable_to_Common_Shareholders:Operating_Income + Net_Income_Applicable_to_Common_Shareholders:Other_Current_Assets + Net_Income_Applicable_to_Common_Shareholders:Other_Liabilities + Net_Income_Applicable_to_Common_Shareholders:Retained_Earnings + Net_Income_Applicable_to_Common_Shareholders:Sales_General_and_Admin + Net_Income_Applicable_to_Common_Shareholders:Total_Assets + Net_Income_Applicable_to_Common_Shareholders:Total_Current_Assets + Net_Income_Applicable_to_Common_Shareholders:Total_Current_Liabilities + Net_Income_Applicable_to_Common_Shareholders:Total_Equity + 
Net_Income_Applicable_to_Common_Shareholders:Total_Liabilities + Net_Income_Applicable_to_Common_Shareholders:Total_Liabilities_and_Equity + Net_Income_Applicable_to_Common_Shareholders:Total_Revenue + Net_Income_Cont_Operations:Net_Receivables + Net_Income_Cont_Operations:Operating_Income + Net_Income_Cont_Operations:Other_Current_Assets + Net_Income_Cont_Operations:Other_Liabilities + Net_Income_Cont_Operations:Retained_Earnings + Net_Income_Cont_Operations:Sales_General_and_Admin + Net_Income_Cont_Operations:Total_Assets + Net_Income_Cont_Operations:Total_Current_Assets + Net_Income_Cont_Operations:Total_Current_Liabilities + Net_Income_Cont_Operations:Total_Equity + Net_Income_Cont_Operations:Total_Liabilities + Net_Income_Cont_Operations:Total_Liabilities_and_Equity + Net_Income_Cont_Operations:Total_Revenue + Net_Receivables:Operating_Income + Net_Receivables:Other_Current_Assets + Net_Receivables:Other_Liabilities + Net_Receivables:Retained_Earnings + Net_Receivables:Sales_General_and_Admin + Net_Receivables:Total_Assets + Net_Receivables:Total_Current_Assets + Net_Receivables:Total_Current_Liabilities + Net_Receivables:Total_Equity + Net_Receivables:Total_Liabilities + Net_Receivables:Total_Liabilities_and_Equity + Net_Receivables:Total_Revenue + Operating_Income:Other_Current_Assets + Operating_Income:Other_Liabilities + Operating_Income:Retained_Earnings + Operating_Income:Sales_General_and_Admin + Operating_Income:Total_Assets + Operating_Income:Total_Current_Assets + Operating_Income:Total_Current_Liabilities + Operating_Income:Total_Equity + Operating_Income:Total_Liabilities + Operating_Income:Total_Liabilities_and_Equity + Operating_Income:Total_Revenue + Other_Current_Assets:Other_Liabilities + Other_Current_Assets:Retained_Earnings + Other_Current_Assets:Sales_General_and_Admin + Other_Current_Assets:Total_Assets + Other_Current_Assets:Total_Current_Assets + Other_Current_Assets:Total_Current_Liabilities + Other_Current_Assets:Total_Equity + 
Other_Current_Assets:Total_Liabilities + Other_Current_Assets:Total_Liabilities_and_Equity + Other_Current_Assets:Total_Revenue + Other_Liabilities:Retained_Earnings + Other_Liabilities:Sales_General_and_Admin + Other_Liabilities:Total_Assets + Other_Liabilities:Total_Current_Assets + Other_Liabilities:Total_Current_Liabilities + Other_Liabilities:Total_Equity + Other_Liabilities:Total_Liabilities + Other_Liabilities:Total_Liabilities_and_Equity + Other_Liabilities:Total_Revenue + Retained_Earnings:Sales_General_and_Admin + Retained_Earnings:Total_Assets + Retained_Earnings:Total_Current_Assets + Retained_Earnings:Total_Current_Liabilities + Retained_Earnings:Total_Equity + Retained_Earnings:Total_Liabilities + Retained_Earnings:Total_Liabilities_and_Equity + Retained_Earnings:Total_Revenue + Sales_General_and_Admin:Total_Assets + Sales_General_and_Admin:Total_Current_Assets + Sales_General_and_Admin:Total_Current_Liabilities + Sales_General_and_Admin:Total_Equity + Sales_General_and_Admin:Total_Liabilities + Sales_General_and_Admin:Total_Liabilities_and_Equity + Sales_General_and_Admin:Total_Revenue + Total_Assets:Total_Current_Assets + Total_Assets:Total_Current_Liabilities + Total_Assets:Total_Equity + Total_Assets:Total_Liabilities + Total_Assets:Total_Liabilities_and_Equity + Total_Assets:Total_Revenue + Total_Current_Assets:Total_Current_Liabilities + Total_Current_Assets:Total_Equity + Total_Current_Assets:Total_Liabilities + Total_Current_Assets:Total_Liabilities_and_Equity + Total_Current_Assets:Total_Revenue + Total_Current_Liabilities:Total_Equity + Total_Current_Liabilities:Total_Liabilities + Total_Current_Liabilities:Total_Liabilities_and_Equity + Total_Current_Liabilities:Total_Revenue + Total_Equity:Total_Liabilities + Total_Equity:Total_Liabilities_and_Equity + Total_Equity:Total_Revenue + Total_Liabilities:Total_Liabilities_and_Equity + Total_Liabilities:Total_Revenue + Total_Liabilities_and_Equity:Total_Revenue'

Multicollinearity in Linear Regression:¶

Explain how multicollinearity can affect the interpretation of a linear regression model's coefficients.¶

Multicollinearity refers to the situation where two or more predictor variables are highly correlated with each other. To check whether our variables are multicollinear, we can look at a heat map of the correlation matrix (corr_df).

In [80]:
sns.heatmap(corr_df)
Out[80]:
<Axes: >
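  • Here we can see that many predictor variables are highly correlated with each other, which is a strong sign of multicollinearity. The regression summaries point the same way: both report a condition number of 1.17e+16.

  • Multicollinearity makes the coefficient estimates unstable: a small change in the data can change the coefficients by a lot. In model50, for example, Total_Equity, Total_Liabilities and Total_Liabilities_and_Equity get coefficients of about ±320.7 with nearly identical standard errors, which is what an (almost) exact linear relationship among predictors looks like.

  • It also inflates the standard errors, so the p-values of individual coefficients become unreliable: a variable that truly matters can look insignificant, and which of two correlated variables appears significant can be arbitrary. Any selection based on these p-values, including FDR control, is therefore harder to trust.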
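A heat map shows pairwise correlation only. A common complementary diagnostic is the variance inflation factor (VIF), which measures how well each predictor is explained by all the others. The sketch below is not part of the original analysis: it runs on synthetic data, and `vif_from_corr` is a hypothetical helper that uses the fact that the VIFs are the diagonal of the inverse correlation matrix.

```python
import numpy as np

def vif_from_corr(X):
    """Return one VIF per column: the diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + x2 + rng.normal(scale=0.05, size=n)  # almost exactly x1 + x2
x4 = rng.normal(size=n)                        # unrelated to the others

vif_collinear = vif_from_corr(np.column_stack([x1, x2, x3]))
vif_independent = vif_from_corr(np.column_stack([x1, x2, x4]))

print("collinear:  ", np.round(vif_collinear, 1))
print("independent:", np.round(vif_independent, 2))
```

A VIF near 1 means a predictor is almost unrelated to the rest; values above roughly 10 are a common rule of thumb for problematic collinearity. The same helper could be applied to the numeric columns of df_final, assuming no column is an exact linear combination of others (in that case the correlation matrix is singular and cannot be inverted).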

P-Value Analysis and Histogram¶

In [81]:
# For Model50 (50 Variable model)

plt.figure(figsize=(5,5))
p_values50 = model50.pvalues
plt.hist(p_values50, bins=100, edgecolor='black')
plt.show()

Create a histogram of the p-values. Is there any skewness? Provide your explanation¶

  • The p-values are clearly skewed: they pile up near 0.

  • If none of the predictors had any effect, the p-values would be uniformly distributed on [0, 1]; that is the null hypothesis here. Visually they do not look uniform, and to confirm this we perform a Kolmogorov-Smirnov test.

In [82]:
## Kolmogorov-Smirnov test

ks_statistic, ks_pvalue = kstest(p_values50, 'uniform')

ks_statistic, ks_pvalue
Out[82]:
(0.568744265519529, 4.025640433853716e-16)
  • The very small p-value (about 4.03e-16) means we reject the null hypothesis: the distribution of the p-values is not uniform.

  • The excess of small p-values is due to the presence of predictor variables that carry real signal.
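To see why a pile-up near 0 indicates signal, it helps to compare simulated p-values with and without signal. The sketch below is illustrative only: the data are synthetic, and `ks_uniform_statistic` is a hypothetical helper that computes the Kolmogorov-Smirnov distance to the uniform distribution directly with NumPy (the same quantity that `kstest(..., 'uniform')` reports as its statistic).

```python
import numpy as np

def ks_uniform_statistic(pvals):
    """Largest gap between the empirical CDF of pvals and the Uniform(0, 1) CDF."""
    p = np.sort(np.asarray(pvals, dtype=float))
    n = len(p)
    ecdf_after = np.arange(1, n + 1) / n   # ECDF just after each point
    ecdf_before = np.arange(0, n) / n      # ECDF just before each point
    return max(np.max(ecdf_after - p), np.max(p - ecdf_before))

rng = np.random.default_rng(42)

# Pure noise: p-values are uniform under the null
null_p = rng.uniform(size=2000)

# 70% noise plus 30% "signal" p-values pushed towards 0
signal_p = rng.uniform(size=600) ** 4
mixed_p = np.concatenate([rng.uniform(size=1400), signal_p])

d_null = ks_uniform_statistic(null_p)
d_mixed = ks_uniform_statistic(mixed_p)
print(f"KS distance, null only: {d_null:.3f}")
print(f"KS distance, with signal: {d_mixed:.3f}")
```

The null-only sample stays close to the uniform CDF, while the mixture departs from it markedly, mirroring the histogram above. One caveat: the KS test assumes independent observations, whereas p-values from one regression with correlated predictors are not independent, so the test result here is best read as supporting evidence.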

In [83]:
# For Model25 (25 Variable model)

p_values25 = model25.pvalues
plt.figure(figsize=(3,3))
plt.hist(p_values25, bins=10, edgecolor='black')
plt.show()

# We can see the same skewness in the smaller model as well.

False Discovery Rate Control with BH Procedure¶

Given the p values you find, use the BH procedure to control the FDR with a q of 0.1¶

  • We will control the False Discovery Rate for p_values50 with the BH procedure. p_values50 holds one p-value per coefficient of model50: 51 in total according to the summary above (the intercept, the four For_Year dummy variables and 46 numeric predictors). We will also look at the resulting alpha, i.e. the p-value cutoff.
In [84]:
### writing a function for FDR calculation and plot

def fdr(pvals, q, plotit=False):
  # Remove NA values
  pvals = np.array(pvals)
  pvals = pvals[~np.isnan(pvals)]
  N = len(pvals)

  # Sort the p-values and calculate the BH threshold line q*k/N
  sorted_pvals = np.sort(pvals)
  k = np.arange(1, N+1)
  fdr_threshold = (q * k) / N

  # Find the largest sorted p-value that lies on or below the BH line.
  # Every p-value up to and including it counts as a discovery.
  below_threshold = sorted_pvals <= fdr_threshold
  if np.any(below_threshold):
    max_index = np.max(np.where(below_threshold))
    alpha = sorted_pvals[max_index]
    n_discoveries = max_index + 1
  else:
    # No p-value is below the line, so nothing is rejected
    alpha, n_discoveries = 0.0, 0
  print(f"Alpha: {alpha}")

  # Optional plot
  if plotit:
    plt.figure(figsize=(5,5))
    plt.scatter(range(N), sorted_pvals, c=np.where(sorted_pvals <= alpha, 'red', 'grey'), marker='o')
    plt.yscale('log')
    plt.plot(range(N), fdr_threshold, linestyle='--', color='blue')
    plt.xlabel("Tests ordered by p-value")
    plt.ylabel("p-values")
    plt.title(f"FDR = {q}")
    plt.show()

  return alpha, n_discoveries
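BH is a step-up procedure: the cutoff is the largest sorted p-value on or below the line q·k/N, and every smaller p-value is rejected too, even one that sits above the line at its own rank. The toy example below is a minimal sketch with made-up p-values; `bh_discoveries` is a hypothetical helper, separate from the function above.

```python
import numpy as np

def bh_discoveries(pvals, q):
    """Return (number of BH discoveries, p-value cutoff) for FDR level q."""
    p = np.sort(np.asarray(pvals, dtype=float))
    n = len(p)
    line = q * np.arange(1, n + 1) / n      # BH line q*k/N for k = 1..N
    below = p <= line
    if not below.any():
        return 0, None                      # nothing is rejected
    k_max = int(np.nonzero(below)[0].max()) + 1
    return k_max, float(p[k_max - 1])

# Made-up p-values; BH line at q = 0.05 is 0.0125, 0.025, 0.0375, 0.05
toy = [0.010, 0.040, 0.045, 0.049]
n_disc, cutoff = bh_discoveries(toy, q=0.05)
print(n_disc, cutoff)
```

Here only the first and the last p-value lie below the line, yet all four are discoveries because the largest qualifying rank is k = 4. Counting the points below the line would give 2, which is why the number of discoveries is taken from the rank of the cutoff.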

FDR For model50¶

In [85]:
fdr(p_values50,0.1, plotit=True)
Alpha: 0.043306657340466424
Out[85]:
(0.043306657340466424, 23)

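How many True discoveries do you estimate?¶

  • There are 23 coefficients (including the intercept) whose p-values fall at or below the BH cutoff of about 0.0433. BH at q = 0.1 keeps the expected share of false discoveries at or below 10%, so we estimate that about 90% of 23 ≈ 21 of the 23 are true discoveries.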

FDR For model25¶

In [86]:
fdr(p_values25,0.1, plotit=True)
Alpha: 0.02822540608114867
Out[86]:
(0.02822540608114867, 11)

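How many True discoveries do you estimate?¶

  • There are 11 coefficients (including the intercept) whose p-values fall at or below the BH cutoff of about 0.0282. BH at q = 0.1 keeps the expected share of false discoveries at or below 10%, so we estimate that about 90% of 11 ≈ 10 of the 11 are true discoveries.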

Sensitivity Analysis of FDR Control:¶

If you apply the BH procedure at different q values, how do the results change?¶

Writing a loop to run FDR on different q values and then compare Discoveries and True Discoveries for different q values

In [87]:
q_vals = [0.01,0.05,0.1, 0.2,0.3,0.4, 0.5, 0.6,0.7,0.8,0.9, 0.95, 0.99]
q = []

for i in q_vals:
    result = fdr(p_values50,i)

    q.append((i, result[0], result[1]))
Alpha: 0.0016752953497703353
Alpha: 0.01747291405674574
Alpha: 0.043306657340466424
Alpha: 0.13713808742164754
Alpha: 0.16668617080262743
Alpha: 0.2680558626508672
Alpha: 0.40665259754033567
Alpha: 0.40665259754033567
Alpha: 0.5998773925777956
Alpha: 0.5998773925777956
Alpha: 0.5998773925777956
Alpha: 0.8412045133002687
Alpha: 0.9690443370743197
In [88]:
q_df = pd.DataFrame(q, columns=['q_value','threshold_p_value', 'discoveries'])
q_df['True_discoveries'] = q_df['discoveries']* (1-q_df['q_value'])
In [89]:
q_df.head(3)
Out[89]:
q_value threshold_p_value discoveries True_discoveries
0 0.01 0.001675 12 11.88
1 0.05 0.017473 18 17.10
2 0.10 0.043307 23 20.70
In [90]:
sns.lineplot(x=q_df['q_value'], y = q_df['True_discoveries'], label = 'True_discoveries')
sns.lineplot(x=q_df['q_value'], y = q_df['discoveries'], label = 'discoveries')
plt.show()

Observations¶

  • As we increase q, the threshold p-value for a discovery increases, so more variables are declared significant (orange line).
  • The estimated number of true discoveries, discoveries × (1 − q) (blue line), does not keep pace: it is 11.88, 17.10 and 20.70 for q = 0.01, 0.05 and 0.10, and the gap to the orange line widens as q grows. Because the number of discoveries can never exceed the number of coefficients, the estimate must eventually fall as q approaches 1.
  • This agrees with the theory: the higher the False Discovery Rate we tolerate, the less faith we can place in any individual discovery.

Exploring Interaction Terms¶

Note - instead of taking the first 25 variables, we are taking the best 25 variables which we prepared during regression model building (model25 in the 2nd question)

In [91]:
df_final.head(3)
Out[91]:
Accounts_Payable Cash_and_Cash_Equivalents Depreciation Earnings_Before_Interest_and_Tax Earnings_Before_Tax Fixed_Assets Gross_Profit Income_Tax Net_Cash_Flow_Operating Net_Income Net_Income_Applicable_to_Common_Shareholders Net_Income_Cont_Operations Net_Receivables Operating_Income Other_Current_Assets Other_Liabilities Retained_Earnings Sales_General_and_Admin Total_Assets Total_Current_Assets Total_Current_Liabilities Total_Equity Total_Liabilities Total_Liabilities_and_Equity Total_Revenue Estimated_Shares_Outstanding
0 3.068000e+09 1.330000e+09 1.001000e+09 -1.813000e+09 -2.445000e+09 1.340200e+10 1.435600e+10 -569000000.0 1.285000e+09 -1.876000e+09 -1.876000e+09 -4.084000e+09 1.124000e+09 1.480000e+08 6.260000e+08 1.514700e+10 -9.462000e+09 1.297700e+10 2.351000e+10 7.072000e+09 9.011000e+09 -7.987000e+09 2.489100e+10 1.690400e+10 2.485500e+10 3.350000e+08
1 4.975000e+09 2.175000e+09 1.020000e+09 -1.324000e+09 -2.180000e+09 1.925900e+10 1.572400e+10 -346000000.0 6.750000e+08 -1.834000e+09 -1.834000e+09 -4.489000e+09 1.560000e+09 1.399000e+09 1.465000e+09 1.491500e+10 -1.129600e+10 1.291300e+10 4.227800e+10 1.432300e+10 1.380600e+10 -2.731000e+09 4.500900e+10 4.227800e+10 2.674300e+10 1.630222e+08
2 4.668000e+09 1.768000e+09 1.342000e+09 4.099000e+09 3.212000e+09 2.308400e+10 2.703000e+10 330000000.0 3.080000e+09 2.882000e+09 2.882000e+09 2.882000e+09 1.771000e+09 4.249000e+09 8.980000e+08 1.092800e+10 -8.562000e+09 2.068600e+10 4.322500e+10 1.175000e+10 1.340400e+10 2.021000e+09 4.120400e+10 4.322500e+10 4.265000e+10 7.169154e+08
In [105]:
len(interaction_terms)
Out[105]:
300
  • Building quadratic and interaction terms for each of the variables.
  • The formula is meant to include the original predictors, their squares, and all pairwise interaction terms (25 choose 2 = 300 pairs).
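The term-building step can be tried on a small scale first. The sketch below uses three hypothetical column names (`a`, `b`, `c`) and a hypothetical response `y`. It wraps the squares in `I(...)`, because inside a patsy formula a bare `x**2` is read as a formula operator rather than as arithmetic, so `I(x**2)` is the usual way to request a true squared column.

```python
import itertools
import math

columns = ["a", "b", "c"]  # hypothetical column names, for illustration only

# I(...) asks patsy to evaluate the expression as ordinary Python arithmetic
squares = [f"I({col}**2)" for col in columns]

# One "x:y" interaction term per unordered pair of columns
pairs = [f"{c1}:{c2}" for c1, c2 in itertools.combinations(columns, 2)]

formula = "y ~ " + " + ".join(columns + squares + pairs)
print(formula)

# With 25 predictors the number of unordered pairs is 25 choose 2
n_pairs_25 = math.comb(25, 2)
print(n_pairs_25)
```

For 25 predictors this gives 300 interaction terms, matching len(interaction_terms) above, plus 25 main effects and 25 squares.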
In [92]:
# building the interaction terms and fitting the model

columns = [col for col in df_final.columns if col != 'Estimated_Shares_Outstanding']
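# Note: inside a patsy formula, `col**2` is a formula operator and reduces to `col`,
# so these terms add no new columns (the summary below lists no squared terms).
# Writing each term as I(col**2) would fit true squared terms instead.
quadratic_terms = [f'{col}**2' for col in columns]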
interaction_terms = [f'{col1}:{col2}' for col1, col2 in itertools.combinations(columns, 2)]
all_terms = columns + quadratic_terms + interaction_terms
formula = 'Estimated_Shares_Outstanding ~ ' + ' + '.join(all_terms)

model25_interaction = smf.ols(formula=formula, data = df_final).fit()

model25_interaction.summary()
Out[92]:
OLS Regression Results
Dep. Variable: Estimated_Shares_Outstanding R-squared: 0.940
Model: OLS Adj. R-squared: 0.924
Method: Least Squares F-statistic: 57.66
Date: Thu, 18 Jan 2024 Prob (F-statistic): 0.00
Time: 14:52:25 Log-Likelihood: -26926.
No. Observations: 1299 AIC: 5.441e+04
Df Residuals: 1019 BIC: 5.586e+04
Df Model: 279
Covariance Type: nonrobust
coef std err t P>|t| [0.025 0.975]
Intercept 0.0003 0.000 3.129 0.002 0.000 0.001
Accounts_Payable 0.0143 0.025 0.565 0.572 -0.035 0.064
Cash_and_Cash_Equivalents 0.0348 0.023 1.499 0.134 -0.011 0.080
Depreciation -0.0783 0.096 -0.819 0.413 -0.266 0.109
Earnings_Before_Interest_and_Tax -0.0315 0.028 -1.121 0.263 -0.087 0.024
Earnings_Before_Tax -0.0410 0.035 -1.174 0.241 -0.110 0.028
Fixed_Assets -0.0046 0.006 -0.793 0.428 -0.016 0.007
Gross_Profit 0.1045 0.031 3.354 0.001 0.043 0.166
Income_Tax 0.0228 0.070 0.325 0.745 -0.115 0.160
Net_Cash_Flow_Operating 0.0051 0.038 0.133 0.894 -0.070 0.080
Net_Income -0.0477 0.038 -1.247 0.213 -0.123 0.027
Net_Income_Applicable_to_Common_Shareholders -0.0485 0.039 -1.259 0.208 -0.124 0.027
Net_Income_Cont_Operations -0.0932 0.069 -1.359 0.175 -0.228 0.041
Net_Receivables -0.0406 0.022 -1.846 0.065 -0.084 0.003
Operating_Income 0.1189 0.085 1.398 0.162 -0.048 0.286
Other_Current_Assets -0.0045 0.051 -0.088 0.930 -0.105 0.096
Other_Liabilities 0.0078 0.021 0.369 0.712 -0.034 0.049
Retained_Earnings -0.0082 0.004 -2.068 0.039 -0.016 -0.000
Sales_General_and_Admin -0.0655 0.031 -2.141 0.033 -0.125 -0.005
Total_Assets 0.0108 0.002 4.950 0.000 0.007 0.015
Total_Current_Assets -0.0083 0.015 -0.557 0.578 -0.037 0.021
Total_Current_Liabilities 0.0055 0.021 0.267 0.789 -0.035 0.046
Total_Equity 0.0252 0.005 4.888 0.000 0.015 0.035
Total_Liabilities -0.0145 0.005 -2.952 0.003 -0.024 -0.005
Total_Liabilities_and_Equity 0.0106 0.002 4.915 0.000 0.006 0.015
Total_Revenue 0.0017 0.003 0.505 0.614 -0.005 0.008
Accounts_Payable:Cash_and_Cash_Equivalents 2.832e-11 9.07e-12 3.122 0.002 1.05e-11 4.61e-11
Accounts_Payable:Depreciation -1.62e-11 2.92e-11 -0.555 0.579 -7.35e-11 4.11e-11
Accounts_Payable:Earnings_Before_Interest_and_Tax -3.491e-11 9.22e-11 -0.379 0.705 -2.16e-10 1.46e-10
Accounts_Payable:Earnings_Before_Tax -7.853e-11 1.59e-10 -0.493 0.622 -3.91e-10 2.34e-10
Accounts_Payable:Fixed_Assets 3.078e-12 2.77e-12 1.113 0.266 -2.35e-12 8.5e-12
Accounts_Payable:Gross_Profit -1.372e-11 1.03e-11 -1.328 0.184 -3.4e-11 6.55e-12
Accounts_Payable:Income_Tax 2.161e-11 1.2e-10 0.180 0.857 -2.13e-10 2.57e-10
Accounts_Payable:Net_Cash_Flow_Operating 6.944e-12 1.21e-11 0.575 0.565 -1.67e-11 3.06e-11
Accounts_Payable:Net_Income 1.131e-10 2.12e-09 0.053 0.957 -4.04e-09 4.27e-09
Accounts_Payable:Net_Income_Applicable_to_Common_Shareholders -1.894e-10 2.12e-09 -0.089 0.929 -4.35e-09 3.97e-09
Accounts_Payable:Net_Income_Cont_Operations 6.674e-11 1e-10 0.665 0.506 -1.3e-10 2.64e-10
Accounts_Payable:Net_Receivables -2.999e-11 5.17e-12 -5.804 0.000 -4.01e-11 -1.98e-11
Accounts_Payable:Operating_Income 1.23e-10 4.62e-11 2.662 0.008 3.23e-11 2.14e-10
Accounts_Payable:Other_Current_Assets -3.348e-11 1.42e-11 -2.357 0.019 -6.14e-11 -5.6e-12
Accounts_Payable:Other_Liabilities 2.407e-11 7.67e-12 3.139 0.002 9.02e-12 3.91e-11
Accounts_Payable:Retained_Earnings -9.541e-13 2.02e-12 -0.473 0.637 -4.92e-12 3.01e-12
Accounts_Payable:Sales_General_and_Admin 1.228e-11 1.12e-11 1.098 0.272 -9.66e-12 3.42e-11
Accounts_Payable:Total_Assets 5.421e-10 6.97e-10 0.778 0.437 -8.26e-10 1.91e-09
Accounts_Payable:Total_Current_Assets 9.341e-13 4.78e-12 0.195 0.845 -8.44e-12 1.03e-11
Accounts_Payable:Total_Current_Liabilities 8.015e-12 4.35e-12 1.842 0.066 -5.22e-13 1.66e-11
Accounts_Payable:Total_Equity 2.522e-06 1.24e-05 0.204 0.839 -2.18e-05 2.68e-05
Accounts_Payable:Total_Liabilities 2.522e-06 1.24e-05 0.204 0.839 -2.18e-05 2.68e-05
Accounts_Payable:Total_Liabilities_and_Equity -2.523e-06 1.24e-05 -0.204 0.839 -2.68e-05 2.18e-05
Accounts_Payable:Total_Revenue -5.255e-13 8.94e-13 -0.588 0.557 -2.28e-12 1.23e-12
Cash_and_Cash_Equivalents:Depreciation 1.586e-11 2.59e-11 0.613 0.540 -3.49e-11 6.66e-11
Cash_and_Cash_Equivalents:Earnings_Before_Interest_and_Tax 3.691e-10 8.19e-11 4.504 0.000 2.08e-10 5.3e-10
Cash_and_Cash_Equivalents:Earnings_Before_Tax -6.868e-10 1.52e-10 -4.524 0.000 -9.85e-10 -3.89e-10
Cash_and_Cash_Equivalents:Fixed_Assets -6.136e-12 3.06e-12 -2.008 0.045 -1.21e-11 -1.39e-13
Cash_and_Cash_Equivalents:Gross_Profit -2.276e-12 1.01e-11 -0.225 0.822 -2.21e-11 1.76e-11
Cash_and_Cash_Equivalents:Income_Tax 2.423e-10 1.22e-10 1.989 0.047 3.31e-12 4.81e-10
Cash_and_Cash_Equivalents:Net_Cash_Flow_Operating 4.794e-11 1.86e-11 2.584 0.010 1.15e-11 8.43e-11
Cash_and_Cash_Equivalents:Net_Income 4.84e-09 1.91e-09 2.531 0.012 1.09e-09 8.59e-09
Cash_and_Cash_Equivalents:Net_Income_Applicable_to_Common_Shareholders -4.87e-09 1.91e-09 -2.544 0.011 -8.63e-09 -1.11e-09
Cash_and_Cash_Equivalents:Net_Income_Cont_Operations 1.89e-10 1.14e-10 1.652 0.099 -3.55e-11 4.14e-10
Cash_and_Cash_Equivalents:Net_Receivables -9.385e-12 8.72e-12 -1.077 0.282 -2.65e-11 7.72e-12
Cash_and_Cash_Equivalents:Operating_Income 1.118e-10 3.98e-11 2.809 0.005 3.37e-11 1.9e-10
Cash_and_Cash_Equivalents:Other_Current_Assets -6.081e-11 1.89e-11 -3.213 0.001 -9.79e-11 -2.37e-11
Cash_and_Cash_Equivalents:Other_Liabilities 1.082e-12 8.87e-12 0.122 0.903 -1.63e-11 1.85e-11
Cash_and_Cash_Equivalents:Retained_Earnings 1.416e-12 1.78e-12 0.794 0.427 -2.08e-12 4.92e-12
Cash_and_Cash_Equivalents:Sales_General_and_Admin -4.169e-12 1.02e-11 -0.409 0.682 -2.42e-11 1.58e-11
Cash_and_Cash_Equivalents:Total_Assets 1.551e-09 4.29e-09 0.362 0.718 -6.87e-09 9.97e-09
Cash_and_Cash_Equivalents:Total_Current_Assets 1.081e-11 3.66e-12 2.953 0.003 3.63e-12 1.8e-11
Cash_and_Cash_Equivalents:Total_Current_Liabilities -1.238e-12 7.16e-12 -0.173 0.863 -1.53e-11 1.28e-11
Cash_and_Cash_Equivalents:Total_Equity 2.738e-06 3.88e-06 0.706 0.480 -4.87e-06 1.03e-05
Cash_and_Cash_Equivalents:Total_Liabilities 2.738e-06 3.88e-06 0.706 0.480 -4.87e-06 1.03e-05
Cash_and_Cash_Equivalents:Total_Liabilities_and_Equity -2.74e-06 3.88e-06 -0.707 0.480 -1.03e-05 4.87e-06
Cash_and_Cash_Equivalents:Total_Revenue -2.488e-12 1.45e-12 -1.713 0.087 -5.34e-12 3.61e-13
Depreciation:Earnings_Before_Interest_and_Tax 6.397e-10 1.98e-10 3.235 0.001 2.52e-10 1.03e-09
Depreciation:Earnings_Before_Tax -8.458e-10 2.76e-10 -3.064 0.002 -1.39e-09 -3.04e-10
Depreciation:Fixed_Assets -2.164e-11 6.04e-12 -3.585 0.000 -3.35e-11 -9.79e-12
Depreciation:Gross_Profit 1.182e-10 2.2e-11 5.371 0.000 7.5e-11 1.61e-10
Depreciation:Income_Tax 3.604e-10 2.46e-10 1.466 0.143 -1.22e-10 8.43e-10
Depreciation:Net_Cash_Flow_Operating -2.234e-11 2.68e-11 -0.834 0.405 -7.49e-11 3.02e-11
Depreciation:Net_Income 4.118e-10 3.6e-09 0.114 0.909 -6.65e-09 7.48e-09
Depreciation:Net_Income_Applicable_to_Common_Shareholders 9.33e-11 3.62e-09 0.026 0.979 -7.01e-09 7.2e-09
Depreciation:Net_Income_Cont_Operations -2.958e-10 2.33e-10 -1.270 0.204 -7.53e-10 1.61e-10
Depreciation:Net_Receivables 1.175e-10 3.35e-11 3.506 0.000 5.17e-11 1.83e-10
Depreciation:Operating_Income -1.139e-11 8.42e-11 -0.135 0.892 -1.77e-10 1.54e-10
Depreciation:Other_Current_Assets 1.054e-10 5.17e-11 2.040 0.042 4.01e-12 2.07e-10
Depreciation:Other_Liabilities -6.349e-12 2.41e-11 -0.263 0.793 -5.37e-11 4.1e-11
Depreciation:Retained_Earnings -2.218e-11 4.97e-12 -4.467 0.000 -3.19e-11 -1.24e-11
Depreciation:Sales_General_and_Admin -1.48e-10 2.73e-11 -5.428 0.000 -2.01e-10 -9.45e-11
Depreciation:Total_Assets 4.249e-09 5.36e-09 0.792 0.428 -6.27e-09 1.48e-08
Depreciation:Total_Current_Assets -2.454e-11 2.02e-11 -1.216 0.224 -6.41e-11 1.51e-11
Depreciation:Total_Current_Liabilities -1.28e-11 2.76e-11 -0.464 0.643 -6.69e-11 4.13e-11
Depreciation:Total_Equity -7.348e-06 1.1e-05 -0.670 0.503 -2.89e-05 1.42e-05
Depreciation:Total_Liabilities -7.348e-06 1.1e-05 -0.670 0.503 -2.89e-05 1.42e-05
Depreciation:Total_Liabilities_and_Equity 7.344e-06 1.1e-05 0.670 0.503 -1.42e-05 2.89e-05
Depreciation:Total_Revenue -1.416e-12 6.12e-12 -0.231 0.817 -1.34e-11 1.06e-11
Earnings_Before_Interest_and_Tax:Earnings_Before_Tax 3.805e-10 2.28e-10 1.672 0.095 -6.6e-11 8.27e-10
Earnings_Before_Interest_and_Tax:Fixed_Assets 5.73e-11 1.88e-11 3.046 0.002 2.04e-11 9.42e-11
Earnings_Before_Interest_and_Tax:Gross_Profit -1.364e-10 1.03e-10 -1.328 0.185 -3.38e-10 6.52e-11
Earnings_Before_Interest_and_Tax:Income_Tax -1.991e-10 4.95e-10 -0.402 0.688 -1.17e-09 7.73e-10
Earnings_Before_Interest_and_Tax:Net_Cash_Flow_Operating -3.304e-10 1.24e-10 -2.658 0.008 -5.74e-10 -8.65e-11
Earnings_Before_Interest_and_Tax:Net_Income 1.266e-08 1.37e-08 0.921 0.357 -1.43e-08 3.96e-08
Earnings_Before_Interest_and_Tax:Net_Income_Applicable_to_Common_Shareholders -1.27e-08 1.37e-08 -0.924 0.356 -3.97e-08 1.43e-08
Earnings_Before_Interest_and_Tax:Net_Income_Cont_Operations 3.625e-10 4.21e-10 0.860 0.390 -4.64e-10 1.19e-09
Earnings_Before_Interest_and_Tax:Net_Receivables 3.281e-10 6.9e-11 4.757 0.000 1.93e-10 4.63e-10
Earnings_Before_Interest_and_Tax:Operating_Income 1.055e-10 2.75e-10 0.384 0.701 -4.34e-10 6.45e-10
Earnings_Before_Interest_and_Tax:Other_Current_Assets -1.096e-10 1.56e-10 -0.704 0.481 -4.15e-10 1.96e-10
Earnings_Before_Interest_and_Tax:Other_Liabilities -7.8e-11 7.41e-11 -1.053 0.293 -2.23e-10 6.74e-11
Earnings_Before_Interest_and_Tax:Retained_Earnings -2.938e-11 1.5e-11 -1.953 0.051 -5.89e-11 1.45e-13
Earnings_Before_Interest_and_Tax:Sales_General_and_Admin 8.345e-11 1.09e-10 0.767 0.443 -1.3e-10 2.97e-10
Earnings_Before_Interest_and_Tax:Total_Assets 2.345e-08 1.12e-08 2.102 0.036 1.56e-09 4.53e-08
Earnings_Before_Interest_and_Tax:Total_Current_Assets -1.251e-10 4.69e-11 -2.666 0.008 -2.17e-10 -3.3e-11
Earnings_Before_Interest_and_Tax:Total_Current_Liabilities 1.64e-10 7.29e-11 2.249 0.025 2.09e-11 3.07e-10
Earnings_Before_Interest_and_Tax:Total_Equity -1.889e-05 2.05e-05 -0.919 0.358 -5.92e-05 2.14e-05
Earnings_Before_Interest_and_Tax:Total_Liabilities -1.889e-05 2.05e-05 -0.919 0.358 -5.92e-05 2.14e-05
Earnings_Before_Interest_and_Tax:Total_Liabilities_and_Equity 1.887e-05 2.06e-05 0.918 0.359 -2.15e-05 5.92e-05
Earnings_Before_Interest_and_Tax:Total_Revenue -1.59e-11 1.13e-11 -1.410 0.159 -3.8e-11 6.23e-12
Earnings_Before_Tax:Fixed_Assets 2.542e-11 2.67e-11 0.952 0.342 -2.7e-11 7.79e-11
Earnings_Before_Tax:Gross_Profit -1.989e-10 1.76e-10 -1.128 0.260 -5.45e-10 1.47e-10
Earnings_Before_Tax:Income_Tax -1.333e-10 3.91e-10 -0.341 0.733 -9e-10 6.33e-10
Earnings_Before_Tax:Net_Cash_Flow_Operating 4.945e-10 1.54e-10 3.208 0.001 1.92e-10 7.97e-10
Earnings_Before_Tax:Net_Income -3.741e-09 1.66e-08 -0.225 0.822 -3.64e-08 2.89e-08
Earnings_Before_Tax:Net_Income_Applicable_to_Common_Shareholders 3.373e-09 1.66e-08 0.203 0.839 -2.93e-08 3.6e-08
Earnings_Before_Tax:Net_Income_Cont_Operations -4.634e-10 3.76e-10 -1.232 0.218 -1.2e-09 2.75e-10
Earnings_Before_Tax:Net_Receivables -7.608e-11 1.46e-10 -0.523 0.601 -3.62e-10 2.1e-10
Earnings_Before_Tax:Operating_Income -2.797e-10 3.59e-10 -0.779 0.436 -9.84e-10 4.25e-10
Earnings_Before_Tax:Other_Current_Assets 1.754e-11 3.16e-10 0.055 0.956 -6.03e-10 6.38e-10
Earnings_Before_Tax:Other_Liabilities 3.256e-10 1.09e-10 2.998 0.003 1.12e-10 5.39e-10
Earnings_Before_Tax:Retained_Earnings 2.34e-11 2.54e-11 0.921 0.357 -2.64e-11 7.32e-11
Earnings_Before_Tax:Sales_General_and_Admin 2.326e-10 2.04e-10 1.139 0.255 -1.68e-10 6.33e-10
Earnings_Before_Tax:Total_Assets 1.245e-08 2.79e-08 0.446 0.656 -4.24e-08 6.72e-08
Earnings_Before_Tax:Total_Current_Assets 5.253e-10 1.13e-10 4.648 0.000 3.04e-10 7.47e-10
Earnings_Before_Tax:Total_Current_Liabilities -4.02e-10 1.24e-10 -3.242 0.001 -6.45e-10 -1.59e-10
Earnings_Before_Tax:Total_Equity 1.525e-05 1.2e-05 1.268 0.205 -8.35e-06 3.89e-05
Earnings_Before_Tax:Total_Liabilities 1.525e-05 1.2e-05 1.268 0.205 -8.35e-06 3.89e-05
Earnings_Before_Tax:Total_Liabilities_and_Equity -1.527e-05 1.2e-05 -1.269 0.205 -3.89e-05 8.34e-06
Earnings_Before_Tax:Total_Revenue -6.285e-12 2.56e-11 -0.246 0.806 -5.65e-11 4.39e-11
Fixed_Assets:Gross_Profit -1.58e-11 3.9e-12 -4.049 0.000 -2.34e-11 -8.14e-12
Fixed_Assets:Income_Tax -6.091e-11 2.12e-11 -2.868 0.004 -1.03e-10 -1.92e-11
Fixed_Assets:Net_Cash_Flow_Operating 5.665e-12 3.66e-12 1.547 0.122 -1.52e-12 1.28e-11
Fixed_Assets:Net_Income 1.115e-09 3.61e-10 3.088 0.002 4.06e-10 1.82e-09
Fixed_Assets:Net_Income_Applicable_to_Common_Shareholders -1.122e-09 3.6e-10 -3.112 0.002 -1.83e-09 -4.14e-10
Fixed_Assets:Net_Income_Cont_Operations -4.908e-11 2.1e-11 -2.341 0.019 -9.02e-11 -7.95e-12
Fixed_Assets:Net_Receivables -7.898e-12 3.37e-12 -2.343 0.019 -1.45e-11 -1.28e-12
Fixed_Assets:Operating_Income -3.024e-11 1e-11 -3.010 0.003 -5e-11 -1.05e-11
Fixed_Assets:Other_Current_Assets -2.842e-12 5e-12 -0.569 0.570 -1.26e-11 6.96e-12
Fixed_Assets:Other_Liabilities 1.626e-12 1.79e-12 0.908 0.364 -1.89e-12 5.14e-12
Fixed_Assets:Retained_Earnings 2.852e-12 5.64e-13 5.059 0.000 1.75e-12 3.96e-12
Fixed_Assets:Sales_General_and_Admin 2.311e-11 4.37e-12 5.284 0.000 1.45e-11 3.17e-11
Fixed_Assets:Total_Assets -5.17e-10 5.16e-10 -1.001 0.317 -1.53e-09 4.96e-10
Fixed_Assets:Total_Current_Assets 7.32e-12 1.83e-12 4.006 0.000 3.73e-12 1.09e-11
Fixed_Assets:Total_Current_Liabilities -1.369e-11 2.33e-12 -5.865 0.000 -1.83e-11 -9.11e-12
Fixed_Assets:Total_Equity -1.156e-05 5.62e-05 -0.206 0.837 -0.000 9.88e-05
Fixed_Assets:Total_Liabilities -1.156e-05 5.62e-05 -0.206 0.837 -0.000 9.88e-05
Fixed_Assets:Total_Liabilities_and_Equity 1.156e-05 5.62e-05 0.206 0.837 -9.88e-05 0.000
Fixed_Assets:Total_Revenue 5.365e-13 3.87e-13 1.386 0.166 -2.23e-13 1.3e-12
Gross_Profit:Income_Tax 4.018e-10 1.57e-10 2.563 0.011 9.42e-11 7.09e-10
Gross_Profit:Net_Cash_Flow_Operating -3.127e-11 9.07e-12 -3.447 0.001 -4.91e-11 -1.35e-11
Gross_Profit:Net_Income -5.159e-10 2.36e-09 -0.219 0.827 -5.14e-09 4.11e-09
Gross_Profit:Net_Income_Applicable_to_Common_Shareholders 2.575e-10 2.36e-09 0.109 0.913 -4.38e-09 4.89e-09
Gross_Profit:Net_Income_Cont_Operations 5.194e-10 1.66e-10 3.126 0.002 1.93e-10 8.45e-10
Gross_Profit:Net_Receivables 8.538e-12 1.26e-11 0.678 0.498 -1.62e-11 3.32e-11
Gross_Profit:Operating_Income 4.956e-11 3.12e-11 1.590 0.112 -1.16e-11 1.11e-10
Gross_Profit:Other_Current_Assets -2.124e-11 2.35e-11 -0.905 0.366 -6.73e-11 2.48e-11
Gross_Profit:Other_Liabilities 1.962e-11 9.87e-12 1.988 0.047 2.55e-13 3.9e-11
Gross_Profit:Retained_Earnings 5.115e-12 2.12e-12 2.410 0.016 9.51e-13 9.28e-12
Gross_Profit:Sales_General_and_Admin -9.453e-13 1.28e-12 -0.739 0.460 -3.46e-12 1.57e-12
Gross_Profit:Total_Assets 6.584e-10 1.58e-09 0.416 0.678 -2.45e-09 3.77e-09
Gross_Profit:Total_Current_Assets 3.036e-12 6.57e-12 0.462 0.644 -9.85e-12 1.59e-11
Gross_Profit:Total_Current_Liabilities -5.617e-12 8.24e-12 -0.682 0.495 -2.18e-11 1.05e-11
Gross_Profit:Total_Equity 5.076e-05 3.16e-05 1.607 0.108 -1.12e-05 0.000
Gross_Profit:Total_Liabilities 5.076e-05 3.16e-05 1.607 0.108 -1.12e-05 0.000
Gross_Profit:Total_Liabilities_and_Equity -5.076e-05 3.16e-05 -1.607 0.108 -0.000 1.12e-05
Gross_Profit:Total_Revenue -1.924e-12 9.33e-13 -2.062 0.039 -3.75e-12 -9.31e-14
Income_Tax:Net_Cash_Flow_Operating -2.519e-10 1.26e-10 -2.001 0.046 -4.99e-10 -4.93e-12
Income_Tax:Net_Income -8.414e-09 1.63e-08 -0.515 0.607 -4.05e-08 2.37e-08
Income_Tax:Net_Income_Applicable_to_Common_Shareholders 8.757e-09 1.63e-08 0.537 0.592 -2.33e-08 4.08e-08
Income_Tax:Net_Income_Cont_Operations -2.491e-10 1.24e-10 -2.007 0.045 -4.93e-10 -5.53e-12
Income_Tax:Net_Receivables -2.457e-10 1.26e-10 -1.953 0.051 -4.92e-10 1.14e-12
Income_Tax:Operating_Income 1.978e-10 2.14e-10 0.925 0.355 -2.22e-10 6.18e-10
Income_Tax:Other_Current_Assets -8.122e-11 2.35e-10 -0.345 0.730 -5.43e-10 3.81e-10
Income_Tax:Other_Liabilities -1.105e-10 7.56e-11 -1.461 0.144 -2.59e-10 3.79e-11
Income_Tax:Retained_Earnings -2.229e-11 2.04e-11 -1.094 0.274 -6.23e-11 1.77e-11
Income_Tax:Sales_General_and_Admin -4.179e-10 1.76e-10 -2.381 0.017 -7.62e-10 -7.35e-11
Income_Tax:Total_Assets -2.302e-08 3.11e-08 -0.740 0.459 -8.41e-08 3.8e-08
Income_Tax:Total_Current_Assets -3.404e-10 9.74e-11 -3.493 0.000 -5.32e-10 -1.49e-10
Income_Tax:Total_Current_Liabilities 3.808e-10 1.1e-10 3.459 0.001 1.65e-10 5.97e-10
Income_Tax:Total_Equity 1.325e-05 8.65e-06 1.532 0.126 -3.72e-06 3.02e-05
Income_Tax:Total_Liabilities 1.325e-05 8.65e-06 1.532 0.126 -3.72e-06 3.02e-05
Income_Tax:Total_Liabilities_and_Equity -1.323e-05 8.65e-06 -1.530 0.126 -3.02e-05 3.74e-06
Income_Tax:Total_Revenue 1.763e-11 2.04e-11 0.865 0.387 -2.24e-11 5.76e-11
Net_Cash_Flow_Operating:Net_Income 4.179e-11 2.19e-09 0.019 0.985 -4.26e-09 4.35e-09
Net_Cash_Flow_Operating:Net_Income_Applicable_to_Common_Shareholders -2.816e-10 2.2e-09 -0.128 0.898 -4.6e-09 4.03e-09
Net_Cash_Flow_Operating:Net_Income_Cont_Operations 4.371e-11 1.08e-10 0.405 0.685 -1.68e-10 2.55e-10
Net_Cash_Flow_Operating:Net_Receivables -1.005e-11 1.06e-11 -0.944 0.345 -3.09e-11 1.08e-11
Net_Cash_Flow_Operating:Operating_Income 6.296e-11 4.85e-11 1.298 0.194 -3.22e-11 1.58e-10
Net_Cash_Flow_Operating:Other_Current_Assets -1.231e-11 2.34e-11 -0.527 0.599 -5.82e-11 3.36e-11
Net_Cash_Flow_Operating:Other_Liabilities -1.115e-11 1e-11 -1.113 0.266 -3.08e-11 8.5e-12
Net_Cash_Flow_Operating:Retained_Earnings 4.767e-14 2.69e-12 0.018 0.986 -5.24e-12 5.34e-12
Net_Cash_Flow_Operating:Sales_General_and_Admin 2.95e-11 9.56e-12 3.085 0.002 1.07e-11 4.83e-11
Net_Cash_Flow_Operating:Total_Assets 3.235e-10 1.21e-09 0.268 0.789 -2.05e-09 2.69e-09
Net_Cash_Flow_Operating:Total_Current_Assets -6.019e-12 7.09e-12 -0.849 0.396 -1.99e-11 7.88e-12
Net_Cash_Flow_Operating:Total_Current_Liabilities -1.214e-11 1.21e-11 -1.006 0.315 -3.58e-11 1.15e-11
Net_Cash_Flow_Operating:Total_Equity -2.401e-05 2.14e-05 -1.120 0.263 -6.61e-05 1.81e-05
Net_Cash_Flow_Operating:Total_Liabilities -2.401e-05 2.14e-05 -1.120 0.263 -6.61e-05 1.81e-05
Net_Cash_Flow_Operating:Total_Liabilities_and_Equity 2.401e-05 2.14e-05 1.120 0.263 -1.81e-05 6.61e-05
Net_Cash_Flow_Operating:Total_Revenue -2.521e-12 1.4e-12 -1.806 0.071 -5.26e-12 2.18e-13
Net_Income:Net_Income_Applicable_to_Common_Shareholders -3.85e-11 4.28e-11 -0.899 0.369 -1.23e-10 4.56e-11
Net_Income:Net_Income_Cont_Operations -2.648e-09 1.14e-08 -0.232 0.817 -2.51e-08 1.98e-08
Net_Income:Net_Receivables 1.176e-08 3.37e-09 3.486 0.001 5.14e-09 1.84e-08
Net_Income:Operating_Income -2.434e-09 5.84e-09 -0.417 0.677 -1.39e-08 9.03e-09
Net_Income:Other_Current_Assets 1.511e-08 3.95e-09 3.828 0.000 7.36e-09 2.29e-08
Net_Income:Other_Liabilities -1.6e-09 1.31e-09 -1.224 0.221 -4.17e-09 9.66e-10
Net_Income:Retained_Earnings -1.704e-09 5.44e-10 -3.132 0.002 -2.77e-09 -6.36e-10
Net_Income:Sales_General_and_Admin 1.263e-09 3.45e-09 0.366 0.715 -5.51e-09 8.04e-09
Net_Income:Total_Assets -6.189e-08 2.81e-08 -2.200 0.028 -1.17e-07 -6.7e-09
Net_Income:Total_Current_Assets -6.054e-09 1.76e-09 -3.443 0.001 -9.5e-09 -2.6e-09
Net_Income:Total_Current_Liabilities 3.642e-09 2.65e-09 1.374 0.170 -1.56e-09 8.85e-09
Net_Income:Total_Equity 1.367e-05 1.45e-05 0.943 0.346 -1.48e-05 4.21e-05
Net_Income:Total_Liabilities 1.367e-05 1.45e-05 0.943 0.346 -1.48e-05 4.21e-05
Net_Income:Total_Liabilities_and_Equity -1.361e-05 1.45e-05 -0.939 0.348 -4.2e-05 1.48e-05
Net_Income:Total_Revenue -7.967e-10 5.43e-10 -1.466 0.143 -1.86e-09 2.69e-10
Net_Income_Applicable_to_Common_Shareholders:Net_Income_Cont_Operations 2.864e-09 1.14e-08 0.252 0.801 -1.95e-08 2.52e-08
Net_Income_Applicable_to_Common_Shareholders:Net_Receivables -1.175e-08 3.37e-09 -3.483 0.001 -1.84e-08 -5.13e-09
Net_Income_Applicable_to_Common_Shareholders:Operating_Income 2.781e-09 5.85e-09 0.475 0.635 -8.7e-09 1.43e-08
Net_Income_Applicable_to_Common_Shareholders:Other_Current_Assets -1.513e-08 3.94e-09 -3.839 0.000 -2.29e-08 -7.4e-09
Net_Income_Applicable_to_Common_Shareholders:Other_Liabilities 1.56e-09 1.3e-09 1.203 0.229 -9.84e-10 4.1e-09
Net_Income_Applicable_to_Common_Shareholders:Retained_Earnings 1.732e-09 5.44e-10 3.181 0.002 6.64e-10 2.8e-09
Net_Income_Applicable_to_Common_Shareholders:Sales_General_and_Admin -1.01e-09 3.45e-09 -0.293 0.770 -7.79e-09 5.76e-09
Net_Income_Applicable_to_Common_Shareholders:Total_Assets 3.89e-08 2.45e-08 1.586 0.113 -9.24e-09 8.7e-08
Net_Income_Applicable_to_Common_Shareholders:Total_Current_Assets 6.037e-09 1.76e-09 3.435 0.001 2.59e-09 9.49e-09
Net_Income_Applicable_to_Common_Shareholders:Total_Current_Liabilities -3.55e-09 2.65e-09 -1.341 0.180 -8.75e-09 1.65e-09
Net_Income_Applicable_to_Common_Shareholders:Total_Equity -6.65e-06 2.37e-05 -0.280 0.779 -5.32e-05 3.99e-05
Net_Income_Applicable_to_Common_Shareholders:Total_Liabilities -6.649e-06 2.37e-05 -0.280 0.779 -5.32e-05 3.99e-05
Net_Income_Applicable_to_Common_Shareholders:Total_Liabilities_and_Equity 6.611e-06 2.37e-05 0.279 0.780 -3.99e-05 5.32e-05
Net_Income_Applicable_to_Common_Shareholders:Total_Revenue 8.014e-10 5.43e-10 1.475 0.140 -2.65e-10 1.87e-09
Net_Income_Cont_Operations:Net_Receivables -2.288e-10 1.2e-10 -1.914 0.056 -4.63e-10 5.78e-12
Net_Income_Cont_Operations:Operating_Income -2.388e-10 2.17e-10 -1.101 0.271 -6.65e-10 1.87e-10
Net_Income_Cont_Operations:Other_Current_Assets -3.642e-11 2.18e-10 -0.167 0.867 -4.64e-10 3.91e-10
Net_Income_Cont_Operations:Other_Liabilities -1.467e-10 6.58e-11 -2.229 0.026 -2.76e-10 -1.76e-11
Net_Income_Cont_Operations:Retained_Earnings -3.337e-11 1.68e-11 -1.984 0.048 -6.64e-11 -3.64e-13
Net_Income_Cont_Operations:Sales_General_and_Admin -4.763e-10 1.85e-10 -2.577 0.010 -8.39e-10 -1.14e-10
Net_Income_Cont_Operations:Total_Assets -1.359e-08 1.12e-08 -1.214 0.225 -3.56e-08 8.38e-09
Net_Income_Cont_Operations:Total_Current_Assets -2.531e-10 9e-11 -2.810 0.005 -4.3e-10 -7.64e-11
Net_Income_Cont_Operations:Total_Current_Liabilities 1.722e-10 9.25e-11 1.862 0.063 -9.28e-12 3.54e-10
Net_Income_Cont_Operations:Total_Equity -3.483e-05 1.6e-05 -2.180 0.030 -6.62e-05 -3.47e-06
Net_Income_Cont_Operations:Total_Liabilities -3.483e-05 1.6e-05 -2.180 0.030 -6.62e-05 -3.47e-06
Net_Income_Cont_Operations:Total_Liabilities_and_Equity 3.485e-05 1.6e-05 2.180 0.029 3.49e-06 6.62e-05
Net_Income_Cont_Operations:Total_Revenue 1.551e-11 1.93e-11 0.804 0.422 -2.24e-11 5.34e-11
Net_Receivables:Operating_Income -3.133e-11 5.21e-11 -0.602 0.548 -1.33e-10 7.08e-11
Net_Receivables:Other_Current_Assets -2.138e-11 1.59e-11 -1.342 0.180 -5.26e-11 9.89e-12
Net_Receivables:Other_Liabilities 1.629e-11 6.83e-12 2.387 0.017 2.9e-12 2.97e-11
Net_Receivables:Retained_Earnings 7.145e-12 1.61e-12 4.427 0.000 3.98e-12 1.03e-11
Net_Receivables:Sales_General_and_Admin -1.838e-11 1.29e-11 -1.422 0.155 -4.37e-11 6.98e-12
Net_Receivables:Total_Assets 2.019e-10 2.23e-09 0.090 0.928 -4.18e-09 4.58e-09
Net_Receivables:Total_Current_Assets -5.607e-13 2.54e-12 -0.221 0.825 -5.55e-12 4.42e-12
Net_Receivables:Total_Current_Liabilities 2.149e-11 4.78e-12 4.495 0.000 1.21e-11 3.09e-11
Net_Receivables:Total_Equity 2.688e-05 1.69e-05 1.591 0.112 -6.28e-06 6e-05
Net_Receivables:Total_Liabilities 2.688e-05 1.69e-05 1.591 0.112 -6.28e-06 6e-05
Net_Receivables:Total_Liabilities_and_Equity -2.688e-05 1.69e-05 -1.591 0.112 -6e-05 6.28e-06
Net_Receivables:Total_Revenue 3.902e-12 1.08e-12 3.617 0.000 1.78e-12 6.02e-12
Operating_Income:Other_Current_Assets 1.692e-10 9.55e-11 1.772 0.077 -1.82e-11 3.57e-10
Operating_Income:Other_Liabilities -9.862e-11 3.67e-11 -2.689 0.007 -1.71e-10 -2.66e-11
Operating_Income:Retained_Earnings 1.506e-11 7.73e-12 1.949 0.052 -9.93e-14 3.02e-11
Operating_Income:Sales_General_and_Admin -4.276e-11 3.77e-11 -1.134 0.257 -1.17e-10 3.13e-11
Operating_Income:Total_Assets -3.184e-09 2.28e-09 -1.394 0.163 -7.66e-09 1.3e-09
Operating_Income:Total_Current_Assets -1.066e-10 2.92e-11 -3.648 0.000 -1.64e-10 -4.93e-11
Operating_Income:Total_Current_Liabilities -4.243e-11 3.3e-11 -1.285 0.199 -1.07e-10 2.23e-11
Operating_Income:Total_Equity -3.96e-05 2.62e-05 -1.513 0.131 -9.1e-05 1.18e-05
Operating_Income:Total_Liabilities -3.96e-05 2.62e-05 -1.513 0.131 -9.1e-05 1.18e-05
Operating_Income:Total_Liabilities_and_Equity 3.96e-05 2.62e-05 1.513 0.130 -1.17e-05 9.1e-05
Operating_Income:Total_Revenue 4.968e-12 4.24e-12 1.171 0.242 -3.36e-12 1.33e-11
Other_Current_Assets:Other_Liabilities 1.01e-11 1.31e-11 0.772 0.440 -1.56e-11 3.58e-11
Other_Current_Assets:Retained_Earnings 9.144e-12 3.6e-12 2.540 0.011 2.08e-12 1.62e-11
Other_Current_Assets:Sales_General_and_Admin 2.621e-11 2.5e-11 1.047 0.295 -2.29e-11 7.53e-11
Other_Current_Assets:Total_Assets 1.587e-09 2.64e-09 0.602 0.547 -3.59e-09 6.76e-09
Other_Current_Assets:Total_Current_Assets 7.213e-12 8.96e-12 0.805 0.421 -1.04e-11 2.48e-11
Other_Current_Assets:Total_Current_Liabilities 6.416e-12 1.06e-11 0.605 0.545 -1.44e-11 2.72e-11
Other_Current_Assets:Total_Equity 4.284e-06 7.95e-06 0.539 0.590 -1.13e-05 1.99e-05
Other_Current_Assets:Total_Liabilities 4.284e-06 7.95e-06 0.539 0.590 -1.13e-05 1.99e-05
Other_Current_Assets:Total_Liabilities_and_Equity -4.285e-06 7.95e-06 -0.539 0.590 -1.99e-05 1.13e-05
Other_Current_Assets:Total_Revenue 7.436e-12 2.8e-12 2.656 0.008 1.94e-12 1.29e-11
Other_Liabilities:Retained_Earnings -1.185e-11 1.74e-12 -6.795 0.000 -1.53e-11 -8.43e-12
Other_Liabilities:Sales_General_and_Admin -2.06e-11 1.01e-11 -2.030 0.043 -4.05e-11 -6.88e-13
Other_Liabilities:Total_Assets -8.503e-10 1.19e-09 -0.716 0.474 -3.18e-09 1.48e-09
Other_Liabilities:Total_Current_Assets 1.792e-12 4.4e-12 0.407 0.684 -6.84e-12 1.04e-11
Other_Liabilities:Total_Current_Liabilities -1.429e-11 6.06e-12 -2.358 0.019 -2.62e-11 -2.4e-12
Other_Liabilities:Total_Equity -7.293e-05 2.47e-05 -2.951 0.003 -0.000 -2.44e-05
Other_Liabilities:Total_Liabilities -7.293e-05 2.47e-05 -2.951 0.003 -0.000 -2.44e-05
Other_Liabilities:Total_Liabilities_and_Equity 7.293e-05 2.47e-05 2.951 0.003 2.44e-05 0.000
Other_Liabilities:Total_Revenue -1.374e-12 1.27e-12 -1.086 0.278 -3.86e-12 1.11e-12
Retained_Earnings:Sales_General_and_Admin -4.137e-12 2.26e-12 -1.830 0.068 -8.57e-12 2.99e-13
Retained_Earnings:Total_Assets 1.263e-10 4.6e-10 0.275 0.784 -7.77e-10 1.03e-09
Retained_Earnings:Total_Current_Assets -2.488e-12 1.26e-12 -1.979 0.048 -4.96e-12 -2.08e-14
Retained_Earnings:Total_Current_Liabilities -1.581e-12 1.76e-12 -0.901 0.368 -5.03e-12 1.86e-12
Retained_Earnings:Total_Equity 1.442e-05 0.000 0.139 0.889 -0.000 0.000
Retained_Earnings:Total_Liabilities 1.442e-05 0.000 0.139 0.889 -0.000 0.000
Retained_Earnings:Total_Liabilities_and_Equity -1.442e-05 0.000 -0.139 0.889 -0.000 0.000
Retained_Earnings:Total_Revenue 3.044e-13 3.44e-13 0.884 0.377 -3.71e-13 9.8e-13
Sales_General_and_Admin:Total_Assets -1.708e-09 2.19e-09 -0.780 0.435 -6e-09 2.59e-09
Sales_General_and_Admin:Total_Current_Assets 3.434e-12 7.44e-12 0.461 0.645 -1.12e-11 1.8e-11
Sales_General_and_Admin:Total_Current_Liabilities 7.731e-12 8.65e-12 0.894 0.372 -9.25e-12 2.47e-11
Sales_General_and_Admin:Total_Equity -1.871e-05 1.45e-05 -1.294 0.196 -4.71e-05 9.67e-06
Sales_General_and_Admin:Total_Liabilities -1.871e-05 1.45e-05 -1.294 0.196 -4.71e-05 9.67e-06
Sales_General_and_Admin:Total_Liabilities_and_Equity 1.871e-05 1.45e-05 1.294 0.196 -9.67e-06 4.71e-05
Sales_General_and_Admin:Total_Revenue 1.85e-12 9.04e-13 2.046 0.041 7.59e-14 3.62e-12
Total_Assets:Total_Current_Assets -7.941e-10 1.95e-09 -0.408 0.684 -4.62e-09 3.03e-09
Total_Assets:Total_Current_Liabilities 7.281e-11 3.83e-10 0.190 0.849 -6.78e-10 8.24e-10
Total_Assets:Total_Equity 2.032e-05 0.000 0.150 0.881 -0.000 0.000
Total_Assets:Total_Liabilities 2.032e-05 0.000 0.150 0.881 -0.000 0.000
Total_Assets:Total_Liabilities_and_Equity -2.032e-05 0.000 -0.150 0.881 -0.000 0.000
Total_Assets:Total_Revenue -2.412e-10 1.61e-10 -1.497 0.135 -5.57e-10 7.49e-11
Total_Current_Assets:Total_Current_Liabilities -1.522e-11 3.6e-12 -4.224 0.000 -2.23e-11 -8.15e-12
Total_Current_Assets:Total_Equity -0.0001 8.02e-05 -1.739 0.082 -0.000 1.79e-05
Total_Current_Assets:Total_Liabilities -0.0001 8.02e-05 -1.739 0.082 -0.000 1.79e-05
Total_Current_Assets:Total_Liabilities_and_Equity 0.0001 8.02e-05 1.739 0.082 -1.79e-05 0.000
Total_Current_Assets:Total_Revenue -7.688e-15 1.07e-12 -0.007 0.994 -2.11e-12 2.1e-12
Total_Current_Liabilities:Total_Equity -3.623e-05 3.61e-05 -1.003 0.316 -0.000 3.46e-05
Total_Current_Liabilities:Total_Liabilities -3.623e-05 3.61e-05 -1.003 0.316 -0.000 3.46e-05
Total_Current_Liabilities:Total_Liabilities_and_Equity 3.623e-05 3.61e-05 1.003 0.316 -3.46e-05 0.000
Total_Current_Liabilities:Total_Revenue 2.683e-13 9.27e-13 0.289 0.772 -1.55e-12 2.09e-12
Total_Equity:Total_Liabilities -1.342e-12 9.89e-13 -1.357 0.175 -3.28e-12 5.99e-13
Total_Equity:Total_Liabilities_and_Equity -2.53e-10 5.8e-10 -0.436 0.663 -1.39e-09 8.86e-10
Total_Equity:Total_Revenue -0.0007 0.000 -2.306 0.021 -0.001 -0.000
Total_Liabilities:Total_Liabilities_and_Equity 4.865e-11 7.2e-10 0.068 0.946 -1.36e-09 1.46e-09
Total_Liabilities:Total_Revenue -0.0007 0.000 -2.306 0.021 -0.001 -0.000
Total_Liabilities_and_Equity:Total_Revenue 0.0007 0.000 2.306 0.021 0.000 0.001
Omnibus: 980.307 Durbin-Watson: 1.339
Prob(Omnibus): 0.000 Jarque-Bera (JB): 34319.059
Skew: 3.122 Prob(JB): 0.00
Kurtosis: 27.394 Cond. No. 2.85e+16


Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 2.85e+16. This might indicate that there are
strong multicollinearity or other numerical problems.
Observations¶

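- The interaction model has an improved R-squared of 0.94 (adjusted R-squared 0.924), meaning it explains about 94% of the in-sample variation in Estimated_Shares_Outstanding. The very large condition number in the summary notes (2.85e+16) points to strong multicollinearity, so individual coefficients and their p-values should be read with caution.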

Briefly explain why interaction terms might be important in the context of predicting Estimated Shares Outstanding using fundamental financial metrics¶
  • In this context, many financial variables do not act purely additively: the effect of one metric on Estimated Shares Outstanding can depend on the level of another. Interaction terms capture these joint effects and so fit the pattern in the data more closely. Here the model with interaction terms has a higher R-squared, although its overall F-statistic is smaller because the fit is spread over many more parameters.
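
As a minimal sketch of what "not purely additive" means, the toy function below uses made-up coefficients (`b1`, `b2`, and `b12` are illustrative assumptions, not values from the fitted model). With an interaction term, the effect of one predictor depends on the level of the other:

```python
def predict(x1, x2, b1=2.0, b2=3.0, b12=0.5):
    """Toy linear model with one interaction term (hypothetical coefficients)."""
    return b1 * x1 + b2 * x2 + b12 * x1 * x2


def effect_of_x1(x2):
    """Change in the prediction when x1 goes from 0 to 1, holding x2 fixed."""
    return predict(1.0, x2) - predict(0.0, x2)


effect_low = effect_of_x1(0.0)    # only b1 contributes
effect_high = effect_of_x1(10.0)  # b1 plus b12 * x2

print(f"Effect of x1 when x2 = 0:  {effect_low}")
print(f"Effect of x1 when x2 = 10: {effect_high}")
```

Without the `b12` term the two effects would be identical, which is the restriction a purely additive model imposes.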

Model Evaluation with Interaction Terms:¶

Evaluate the performance of this new model with interaction terms. Compare it with the performance of the original model without interaction terms using appropriate metrics¶

In [93]:
#comparing the R-squared value of the original model with the interaction model

print(f'The R-squared value of the original model is : {model25.rsquared}')
print(f'The R-squared value of the interaction model is : {model25_interaction.rsquared}')
The R-squared value of the original model is : 0.7836502719097538
The R-squared value of the interaction model is : 0.9404350419868118
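  • The R-squared value increased substantially, from about 0.78 to 0.94, once the quadratic and interaction terms were added. R-squared can never decrease when predictors are added, so it favors the larger model by construction. The adjusted R-squared of the interaction model (0.924 in the summary above) is still well above the original model's R-squared, which indicates the gain is not just an artifact of the extra terms.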

Discuss any significant changes in the model's performance or the coefficients of the predictors¶

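  • Interaction terms increased the overall fit of the model, raising the R-squared from 0.78 to 0.94.
  • The F-statistic of the interaction model is lower (57.66) because the explained variance is now divided among 279 model degrees of freedom. The overall regression is still highly significant (Prob (F-statistic) is reported as 0.00).
  • Many main effects are no longer individually significant, and interaction terms involving Total_Equity, Total_Liabilities, and Total_Liabilities_and_Equity show matching estimates and p-values up to sign. This is consistent with the multicollinearity warning in the summary, so single coefficients should not be over-interpreted.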
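
The link between R-squared, adjusted R-squared, and the F-statistic can be checked by hand from the summary table. The sketch below assumes the standard OLS formulas for a model with an intercept and plugs in the values reported above (1299 observations, 279 model degrees of freedom). The helper names are hypothetical:

```python
def adjusted_r2(r2, n_obs, df_model):
    """Adjusted R-squared for an OLS model with an intercept."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - df_model - 1)


def f_statistic(r2, n_obs, df_model):
    """Overall F-statistic expressed through R-squared."""
    df_resid = n_obs - df_model - 1
    return (r2 / df_model) / ((1 - r2) / df_resid)


# values copied from the interaction model's summary and printed R-squared
r2_interaction = 0.9404350419868118
n_obs = 1299
df_model = 279

adj_r2 = adjusted_r2(r2_interaction, n_obs, df_model)
f_stat = f_statistic(r2_interaction, n_obs, df_model)

print(f"Adjusted R-squared: {adj_r2:.3f}")
print(f"F-statistic:        {f_stat:.2f}")
```

The results should agree with the summary's Adj. R-squared (0.924) and F-statistic (57.66) up to rounding. For a fixed R-squared, a larger `df_model` lowers both quantities, which is why a better-fitting model can still report a smaller F-statistic.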

FDR Analysis with Interaction Terms¶

Create a histogram of the p-values for the new model including interaction terms. Discuss any noticeable differences from the histogram you created for the original model.¶

In [94]:
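# p-values of every term in the interaction model
p_values25_int = model25_interaction.pvalues
plt.figure(figsize=(5,5))
plt.hist(p_values25_int, bins=50, edgecolor='black')
plt.xlabel('p-value')
plt.ylabel('Count')
plt.title('p-values: interaction model')
plt.show()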
Discuss any noticeable differences from the histogram you created for the original model.¶
  • The most noticeable feature of the interaction model's p-value histogram is an even stronger concentration of p-values near zero (a more pronounced right skew). This suggests that more terms carry real signal, which makes true predictors easier to separate from nulls.

Apply the Benjamini-Hochberg (BH) procedure to control the False Discovery Rate (FDR) with a q-value of 0.1.¶

How many significant predictors are identified now, including both main effects and interaction effects?

In [95]:
fdr(p_values25_int ,0.1, plotit=True)
Alpha: 0.024741481334641803
Out[95]:
(0.024741481334641803, 81)
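  • The BH procedure flags 81 terms as significant across main effects and interaction effects, with a p-value cutoff (reported as Alpha) of about 0.0247. At q = 0.1, up to about 10% of these, roughly 8, are expected to be false discoveries, which leaves about 73 expected true discoveries.

  • This is aligned with the more extreme skew toward zero seen in the interaction model's p-value histogram.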
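
The notebook's `fdr` helper is defined earlier and is not reproduced here. As a rough sketch of the step-up rule behind the BH procedure, the hypothetical function `benjamini_hochberg` below finds the largest rank k whose sorted p-value satisfies p_(k) <= q * k / m and rejects every hypothesis up to that rank. The example p-values are invented for illustration:

```python
def benjamini_hochberg(p_values, q):
    """Return (cutoff, n_discoveries) for the BH step-up rule at FDR level q."""
    ordered = sorted(p_values)
    m = len(ordered)
    cutoff = 0.0
    n_discoveries = 0
    for rank, p in enumerate(ordered, start=1):
        # keep the largest rank whose p-value is under its BH threshold
        if p <= q * rank / m:
            cutoff = p
            n_discoveries = rank
    return cutoff, n_discoveries


# invented p-values, not taken from the regression output
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.065, 0.074, 0.205, 0.212, 0.216]
cutoff, n_found = benjamini_hochberg(example_p, q=0.1)

print(f"Cutoff: {cutoff}, discoveries: {n_found}")
print(f"Expected false discoveries (at most): {0.1 * n_found:.1f}")
```

In this toy input the third and fourth p-values exceed their own thresholds but are still rejected, because a larger p-value further down the list passes its threshold. Assuming `fdr` implements standard BH, applying the same rule to the interaction model's p-values with q = 0.1 would be expected to give a cutoff and count like those printed above.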

Compare these results with those obtained from the original model. Discuss the impact of including interaction terms on the number of discoveries and the control of the FDR¶

  • In the original model, BH identified 11 significant predictors, of which about 10 are expected to be true discoveries at q = 0.1.
  • In the interaction model, BH identified 81 significant terms, of which about 73 are expected to be true discoveries at the same q.
  • Including interaction terms therefore greatly increases the number of discoveries, while BH keeps the expected proportion of false discoveries at or below 10% in both models. Because the larger model tests many more hypotheses, the absolute number of expected false discoveries also rises, from about 1 to about 8.